8000 Add asymptotes for benchmarking framework by JeremyRubin · Pull Request #17375 · bitcoin/bitcoin · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add asymptotes for benchmarking framework #17375

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

JeremyRubin
Copy link
Contributor

It can be pretty convenient for certain classes of benchmark to be able to run it with dynamic parameters, to generate curves of magnitudes runtimes.

This PR adds this functionality and shows an example using it.

This functionality is a bit useless unless you also combine it with a filter to ensure the params are meaningful for the target asymptotic benchmark.

@fanquake fanquake added the Tests label Nov 5, 2019
@@ -25,6 +26,7 @@ static void SetupBenchArgs()
gArgs.AddArg("-evals=<n>", strprintf("Number of measurement evaluations to perform. (default: %u)", DEFAULT_BENCH_EVALUATIONS), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
gArgs.AddArg("-filter=<regex>", strprintf("Regular expression filter to select benchmark by name (default: %s)", DEFAULT_BENCH_FILTER), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
gArgs.AddArg("-scaling=<n>", strprintf("Scaling factor for benchmark's runtime (default: %u)", DEFAULT_BENCH_SCALING), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
gArgs.AddArg("-asymptote=n", strprintf("Asymptotic factors for benchmark's runtime, positional (default: \"\")"), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add some short summary, as to what 'asymptotic factors' mean in this context? I don't think it's self-explanatory, this argument help gives me zero context on how to use it in practice 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I squash-pushed improved documentation for this parameter, which includes an example of how to use it.

@maflcko
Copy link
Member
maflcko commented Nov 7, 2019

Not sure how useful the general framework is. A smaller patch would be to just copy past the BENCHMARK(ComplexMemPool, 1); line and hardcode the "asymptotes". Or if you like, make the benchmark settings dynamic, so that an asymptote can be created by running bench_bitcoin -filter=... -scale_asymptotes_factor=5 to multiply all dynamic settings in the selected/filtered bench by that factor.

@JeremyRubin
Copy link
Contributor Author

@MarcoFalke that's insufficient for the example given. In that benchmark, all different sorts of asymptotic behaviors can be triggered with different growth factors.

I typically run it something like this:

for x in 10 100 1000 10000 100000 1000000; do ./src/bench/bench_bitcoin -asymptote=100 -asymptote=100 -asymptote=$x -asymptote=10 --evals=1; done | grep Com --line-buffered | cut -d, -f6

Scaling all the factors doesn't let you express nuances.

The other issue is that if you just copy the existing Benchmark stuff you lose the 'runs in one second" property.

@JeremyRubin JeremyRubin force-pushed the asymptotic-benchmarks branch from f160100 to 32862e5 Compare November 11, 2019 19:45
@DrahtBot
Copy link
Contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #17781 (rpc: Remove mempool global from miner by MarcoFalke)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot
Copy link
Contributor
DrahtBot commented Jan 2, 2020
Needs rebase

@JeremyRubin
Copy link
Contributor Author

closed in favor of #18011

martinus added a commit to martinus/bitcoin that referenced this pull request Feb 20, 2020
This adds support to calculate asymptotic complexity of a benchmark.
This is similar to bitcoin#17375, but currently only one asymptote is
supported, and I have added support in the benchmark `ComplexMemPool`
as an example.

Usage is e.g. like this:

```
./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
```

This runs the benchmark `ComplexMemPool` several times but with
different complexityN settings. The benchmark can extract that number
and use it accordingly. Here, it's used for `childTxs`. The output is
this:

| complexityN |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |     total | benchmark
|------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
|          25 |        1,064,241.00 |              939.64 |    1.4% |    3,960,279.00 |    2,829,708.00 |  1.400 |      0.01 | `ComplexMemPool`
|          50 |        1,579,530.00 |              633.10 |    1.0% |    6,231,810.00 |    4,412,674.00 |  1.412 |      0.02 | `ComplexMemPool`
|         100 |        4,022,774.00 |              248.58 |    0.6% |   16,544,406.00 |   11,889,535.00 |  1.392 |      0.04 | `ComplexMemPool`
|         200 |       15,390,986.00 |               64.97 |    0.2% |   63,904,254.00 |   47,731,705.00 |  1.339 |      0.17 | `ComplexMemPool`
|         400 |       69,394,711.00 |               14.41 |    0.1% |  272,602,461.00 |  219,014,691.00 |  1.245 |      0.76 | `ComplexMemPool`
|         600 |      168,977,165.00 |                5.92 |    0.1% |  639,108,082.00 |  535,316,887.00 |  1.194 |      1.86 | `ComplexMemPool`
|         800 |      310,109,077.00 |                3.22 |    0.1% |1,149,134,246.00 |  984,620,812.00 |  1.167 |      3.41 | `ComplexMemPool`

|   coefficient |   err% | complexity
|--------------:|-------:|------------
|   4.78486e-07 |   4.5% | O(n^2)
|   6.38557e-10 |  21.7% | O(n^3)
|   3.42338e-05 |  38.0% | O(n log n)
|   0.000313914 |  46.9% | O(n)
|     0.0129823 | 114.4% | O(log n)
|     0.0815055 | 133.8% | O(1)

The best fitting curve is O(n^2), so the algorithm seems to scale
quadratic with `childTxs` in the range 25 to 800.
martinus added a commit to martinus/bitcoin that referenced this pull request Mar 7, 2020
This adds support to calculate asymptotic complexity of a benchmark.
This is similar to bitcoin#17375, but currently only one asymptote is
supported, and I have added support in the benchmark `ComplexMemPool`
as an example.

Usage is e.g. like this:

```
./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
```

This runs the benchmark `ComplexMemPool` several times but with
different complexityN settings. The benchmark can extract that number
and use it accordingly. Here, it's used for `childTxs`. The output is
this:

| complexityN |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |     total | benchmark
|------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
|          25 |        1,064,241.00 |              939.64 |    1.4% |    3,960,279.00 |    2,829,708.00 |  1.400 |      0.01 | `ComplexMemPool`
|          50 |        1,579,530.00 |              633.10 |    1.0% |    6,231,810.00 |    4,412,674.00 |  1.412 |      0.02 | `ComplexMemPool`
|         100 |        4,022,774.00 |              248.58 |    0.6% |   16,544,406.00 |   11,889,535.00 |  1.392 |      0.04 | `ComplexMemPool`
|         200 |       15,390,986.00 |               64.97 |    0.2% |   63,904,254.00 |   47,731,705.00 |  1.339 |      0.17 | `ComplexMemPool`
|         400 |       69,394,711.00 |               14.41 |    0.1% |  272,602,461.00 |  219,014,691.00 |  1.245 |      0.76 | `ComplexMemPool`
|         600 |      168,977,165.00 |                5.92 |    0.1% |  639,108,082.00 |  535,316,887.00 |  1.194 |      1.86 | `ComplexMemPool`
|         800 |      310,109,077.00 |                3.22 |    0.1% |1,149,134,246.00 |  984,620,812.00 |  1.167 |      3.41 | `ComplexMemPool`

|   coefficient |   err% | complexity
|--------------:|-------:|------------
|   4.78486e-07 |   4.5% | O(n^2)
|   6.38557e-10 |  21.7% | O(n^3)
|   3.42338e-05 |  38.0% | O(n log n)
|   0.000313914 |  46.9% | O(n)
|     0.0129823 | 114.4% | O(log n)
|     0.0815055 | 133.8% | O(1)

The best fitting curve is O(n^2), so the algorithm seems to scale
quadratic with `childTxs` in the range 25 to 800.
martinus added a commit to martinus/bitcoin that referenced this pull request Mar 8, 2020
This adds support to calculate asymptotic complexity of a benchmark.
This is similar to bitcoin#17375, but currently only one asymptote is
supported, and I have added support in the benchmark `ComplexMemPool`
as an example.

Usage is e.g. like this:

```
./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
```

This runs the benchmark `ComplexMemPool` several times but with
different complexityN settings. The benchmark can extract that number
and use it accordingly. Here, it's used for `childTxs`. The output is
this:

| complexityN |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |     total | benchmark
|------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
|          25 |        1,064,241.00 |              939.64 |    1.4% |    3,960,279.00 |    2,829,708.00 |  1.400 |      0.01 | `ComplexMemPool`
|          50 |        1,579,530.00 |              633.10 |    1.0% |    6,231,810.00 |    4,412,674.00 |  1.412 |      0.02 | `ComplexMemPool`
|         100 |        4,022,774.00 |              248.58 |    0.6% |   16,544,406.00 |   11,889,535.00 |  1.392 |      0.04 | `ComplexMemPool`
|         200 |       15,390,986.00 |               64.97 |    0.2% |   63,904,254.00 |   47,731,705.00 |  1.339 |      0.17 | `ComplexMemPool`
|         400 |       69,394,711.00 |               14.41 |    0.1% |  272,602,461.00 |  219,014,691.00 |  1.245 |      0.76 | `ComplexMemPool`
|         600 |      168,977,165.00 |                5.92 |    0.1% |  639,108,082.00 |  535,316,887.00 |  1.194 |      1.86 | `ComplexMemPool`
|         800 |      310,109,077.00 |                3.22 |    0.1% |1,149,134,246.00 |  984,620,812.00 |  1.167 |      3.41 | `ComplexMemPool`

|   coefficient |   err% | complexity
|--------------:|-------:|------------
|   4.78486e-07 |   4.5% | O(n^2)
|   6.38557e-10 |  21.7% | O(n^3)
|   3.42338e-05 |  38.0% | O(n log n)
|   0.000313914 |  46.9% | O(n)
|     0.0129823 | 114.4% | O(log n)
|     0.0815055 | 133.8% | O(1)

The best fitting curve is O(n^2), so the algorithm seems to scale
quadratic with `childTxs` in the range 25 to 800.
martinus added a commit to martinus/bitcoin that referenced this pull request Mar 28, 2020
This adds support to calculate asymptotic complexity of a benchmark.
This is similar to bitcoin#17375, but currently only one asymptote is
supported, and I have added support in the benchmark `ComplexMemPool`
as an example.

Usage is e.g. like this:

```
./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
```

This runs the benchmark `ComplexMemPool` several times but with
different complexityN settings. The benchmark can extract that number
and use it accordingly. Here, it's used for `childTxs`. The output is
this:

| complexityN |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |     total | benchmark
|------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
|          25 |        1,064,241.00 |              939.64 |    1.4% |    3,960,279.00 |    2,829,708.00 |  1.400 |      0.01 | `ComplexMemPool`
|          50 |        1,579,530.00 |              633.10 |    1.0% |    6,231,810.00 |    4,412,674.00 |  1.412 |      0.02 | `ComplexMemPool`
|         100 |        4,022,774.00 |              248.58 |    0.6% |   16,544,406.00 |   11,889,535.00 |  1.392 |      0.04 | `ComplexMemPool`
|         200 |       15,390,986.00 |               64.97 |    0.2% |   63,904,254.00 |   47,731,705.00 |  1.339 |      0.17 | `ComplexMemPool`
|         400 |       69,394,711.00 |               14.41 |    0.1% |  272,602,461.00 |  219,014,691.00 |  1.245 |      0.76 | `ComplexMemPool`
|         600 |      168,977,165.00 |                5.92 |    0.1% |  639,108,082.00 |  535,316,887.00 |  1.194 |      1.86 | `ComplexMemPool`
|         800 |      310,109,077.00 |                3.22 |    0.1% |1,149,134,246.00 |  984,620,812.00 |  1.167 |      3.41 | `ComplexMemPool`

|   coefficient |   err% | complexity
|--------------:|-------:|------------
|   4.78486e-07 |   4.5% | O(n^2)
|   6.38557e-10 |  21.7% | O(n^3)
|   3.42338e-05 |  38.0% | O(n log n)
|   0.000313914 |  46.9% | O(n)
|     0.0129823 | 114.4% | O(log n)
|     0.0815055 | 133.8% | O(1)

The best fitting curve is O(n^2), so the algorithm seems to scale
quadratic with `childTxs` in the range 25 to 800.
martinus added a commit to martinus/bitcoin that referenced this pull request Jun 13, 2020
This replaces the current benchmarking framework with nanobench [1], an
MIT licensed single-header benchmarking library, of which I am the
autor. This has in my opinion several advantages, especially on Linux:

* fast: Running all benchmarks takes ~6 seconds instead of 4m13s on
  an Intel i7-8700 CPU @ 3.20GHz.

* accurate: I ran e.g. the benchmark for SipHash_32b 10 times and
  calculate standard deviation / mean = coefficient of variation:

  * 0.57% CV for old benchmarking framework
  * 0.20% CV for nanobench

  So the benchmark results with nanobench seem to vary less than with
  the old framework.

* It automatically determines runtime based on clock precision, no need
  to specify number of evaluations.

* measure instructions, cycles, branches, instructions per cycle,
  branch misses (only Linux, when performance counters are available)

* output in markdown table format.

* Warn about unstable environment (frequency scaling, turbo, ...)

* For better profiling, it is possible to set the environment variable
  NANOBENCH_ENDLESS to force endless running of a particular benchmark
  without the need to recompile. This makes it to e.g. run "perf top"
  and look at hotspots.

Here is an example copy & pasted from the terminal output:

|             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|                2.52 |      396,529,415.94 |    0.6% |           25.42 |            8.02 |  3.169 |           0.06 |    0.0% |      0.03 | `bench/crypto_hash.cpp RIPEMD160`
|                1.87 |      535,161,444.83 |    0.3% |           21.36 |            5.95 |  3.589 |           0.06 |    0.0% |      0.02 | `bench/crypto_hash.cpp SHA1`
|                3.22 |      310,344,174.79 |    1.1% |           36.80 |           10.22 |  3.601 |           0.09 |    0.0% |      0.04 | `bench/crypto_hash.cpp SHA256`
|                2.01 |      496,375,796.23 |    0.0% |           18.72 |            6.43 |  2.911 |           0.01 |    1.0% |      0.00 | `bench/crypto_hash.cpp SHA256D64_1024`
|                7.23 |      138,263,519.35 |    0.1% |           82.66 |           23.11 |  3.577 |           1.63 |    0.1% |      0.00 | `bench/crypto_hash.cpp SHA256_32b`
|                3.04 |      328,780,166.40 |    0.3% |           35.82 |            9.69 |  3.696 |           0.03 |    0.0% |      0.03 | `bench/crypto_hash.cpp SHA512`

[1] https://github.com/martinus/nanobench

* Adds support for asymptotes

  This adds support to calculate asymptotic complexity of a benchmark.
  This is similar to bitcoin#17375, but currently only one asymptote is
  supported, and I have added support in the benchmark `ComplexMemPool`
  as an example.

  Usage is e.g. like this:

  ```
  ./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
  ```

  This runs the benchmark `ComplexMemPool` several times but with
  different complexityN settings. The benchmark can extract that number
  and use it accordingly. Here, it's used for `childTxs`. The output is
  this:

  | complexityN |             
8000
  ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |     total | benchmark
  |------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
  |          25 |        1,064,241.00 |              939.64 |    1.4% |    3,960,279.00 |    2,829,708.00 |  1.400 |      0.01 | `ComplexMemPool`
  |          50 |        1,579,530.00 |              633.10 |    1.0% |    6,231,810.00 |    4,412,674.00 |  1.412 |      0.02 | `ComplexMemPool`
  |         100 |        4,022,774.00 |              248.58 |    0.6% |   16,544,406.00 |   11,889,535.00 |  1.392 |      0.04 | `ComplexMemPool`
  |         200 |       15,390,986.00 |               64.97 |    0.2% |   63,904,254.00 |   47,731,705.00 |  1.339 |      0.17 | `ComplexMemPool`
  |         400 |       69,394,711.00 |               14.41 |    0.1% |  272,602,461.00 |  219,014,691.00 |  1.245 |      0.76 | `ComplexMemPool`
  |         600 |      168,977,165.00 |                5.92 |    0.1% |  639,108,082.00 |  535,316,887.00 |  1.194 |      1.86 | `ComplexMemPool`
  |         800 |      310,109,077.00 |                3.22 |    0.1% |1,149,134,246.00 |  984,620,812.00 |  1.167 |      3.41 | `ComplexMemPool`

  |   coefficient |   err% | complexity
  |--------------:|-------:|------------
  |   4.78486e-07 |   4.5% | O(n^2)
  |   6.38557e-10 |  21.7% | O(n^3)
  |   3.42338e-05 |  38.0% | O(n log n)
  |   0.000313914 |  46.9% | O(n)
  |     0.0129823 | 114.4% | O(log n)
  |     0.0815055 | 133.8% | O(1)

  The best fitting curve is O(n^2), so the algorithm seems to scale
  quadratic with `childTxs` in the range 25 to 800.
Warchant pushed a commit to Warchant/bitcoin that referenced this pull request Aug 6, 2020
This replaces the current benchmarking framework with nanobench [1], an
MIT licensed single-header benchmarking library, of which I am the
autor. This has in my opinion several advantages, especially on Linux:

* fast: Running all benchmarks takes ~6 seconds instead of 4m13s on
  an Intel i7-8700 CPU @ 3.20GHz.

* accurate: I ran e.g. the benchmark for SipHash_32b 10 times and
  calculate standard deviation / mean = coefficient of variation:

  * 0.57% CV for old benchmarking framework
  * 0.20% CV for nanobench

  So the benchmark results with nanobench seem to vary less than with
  the old framework.

* It automatically determines runtime based on clock precision, no need
  to specify number of evaluations.

* measure instructions, cycles, branches, instructions per cycle,
  branch misses (only Linux, when performance counters are available)

* output in markdown table format.

* Warn about unstable environment (frequency scaling, turbo, ...)

* For better profiling, it is possible to set the environment variable
  NANOBENCH_ENDLESS to force endless running of a particular benchmark
  without the need to recompile. This makes it to e.g. run "perf top"
  and look at hotspots.

Here is an example copy & pasted from the terminal output:

|             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|                2.52 |      396,529,415.94 |    0.6% |           25.42 |            8.02 |  3.169 |           0.06 |    0.0% |      0.03 | `bench/crypto_hash.cpp RIPEMD160`
|                1.87 |      535,161,444.83 |    0.3% |           21.36 |            5.95 |  3.589 |           0.06 |    0.0% |      0.02 | `bench/crypto_hash.cpp SHA1`
|                3.22 |      310,344,174.79 |    1.1% |           36.80 |           10.22 |  3.601 |           0.09 |    0.0% |      0.04 | `bench/crypto_hash.cpp SHA256`
|                2.01 |      496,375,796.23 |    0.0% |           18.72 |            6.43 |  2.911 |           0.01 |    1.0% |      0.00 | `bench/crypto_hash.cpp SHA256D64_1024`
|                7.23 |      138,263,519.35 |    0.1% |           82.66 |           23.11 |  3.577 |           1.63 |    0.1% |      0.00 | `bench/crypto_hash.cpp SHA256_32b`
|                3.04 |      328,780,166.40 |    0.3% |           35.82 |            9.69 |  3.696 |           0.03 |    0.0% |      0.03 | `bench/crypto_hash.cpp SHA512`

[1] https://github.com/martinus/nanobench

* Adds support for asymptotes

  This adds support to calculate asymptotic complexity of a benchmark.
  This is similar to bitcoin#17375, but currently only one asymptote is
  supported, and I have added support in the benchmark `ComplexMemPool`
  as an example.

  Usage is e.g. like this:

  ```
  ./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
  ```

  This runs the benchmark `ComplexMemPool` several times but with
  different complexityN settings. The benchmark can extract that number
  and use it accordingly. Here, it's used for `childTxs`. The output is
  this:

  | complexityN |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |     total | benchmark
  |------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
  |          25 |        1,064,241.00 |              939.64 |    1.4% |    3,960,279.00 |    2,829,708.00 |  1.400 |      0.01 | `ComplexMemPool`
  |          50 |        1,579,530.00 |              633.10 |    1.0% |    6,231,810.00 |    4,412,674.00 |  1.412 |      0.02 | `ComplexMemPool`
  |         100 |        4,022,774.00 |              248.58 |    0.6% |   16,544,406.00 |   11,889,535.00 |  1.392 |      0.04 | `ComplexMemPool`
  |         200 |       15,390,986.00 |               64.97 |    0.2% |   63,904,254.00 |   47,731,705.00 |  1.339 |      0.17 | `ComplexMemPool`
  |         400 |       69,394,711.00 |               14.41 |    0.1% |  272,602,461.00 |  219,014,691.00 |  1.245 |      0.76 | `ComplexMemPool`
  |         600 |      168,977,165.00 |                5.92 |    0.1% |  639,108,082.00 |  535,316,887.00 |  1.194 |      1.86 | `ComplexMemPool`
  |         800 |      310,109,077.00 |                3.22 |    0.1% |1,149,134,246.00 |  984,620,812.00 |  1.167 |      3.41 | `ComplexMemPool`

  |   coefficient |   err% | complexity
  |--------------:|-------:|------------
  |   4.78486e-07 |   4.5% | O(n^2)
  |   6.38557e-10 |  21.7% | O(n^3)
  |   3.42338e-05 |  38.0% | O(n log n)
  |   0.000313914 |  46.9% | O(n)
  |     0.0129823 | 114.4% | O(log n)
  |     0.0815055 | 133.8% | O(1)

  The best fitting curve is O(n^2), so the algorithm seems to scale
  quadratic with `childTxs` in the range 25 to 800.
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request May 1, 2021
This replaces the current benchmarking framework with nanobench [1], an
MIT licensed single-header benchmarking library, of which I am the
autor. This has in my opinion several advantages, especially on Linux:

* fast: Running all benchmarks takes ~6 seconds instead of 4m13s on
  an Intel i7-8700 CPU @ 3.20GHz.

* accurate: I ran e.g. the benchmark for SipHash_32b 10 times and
  calculate standard deviation / mean = coefficient of variation:

  * 0.57% CV for old benchmarking framework
  * 0.20% CV for nanobench

  So the benchmark results with nanobench seem to vary less than with
  the old framework.

* It automatically determines runtime based on clock precision, no need
  to specify number of evaluations.

* measure instructions, cycles, branches, instructions per cycle,
  branch misses (only Linux, when performance counters are available)

* output in markdown table format.

* Warn about unstable environment (frequency scaling, turbo, ...)

* For better profiling, it is possible to set the environment variable
  NANOBENCH_ENDLESS to force endless running of a particular benchmark
  without the need to recompile. This makes it to e.g. run "perf top"
  and look at hotspots.

Here is an example copy & pasted from the terminal output:

|             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|                2.52 |      396,529,415.94 |    0.6% |           25.42 |            8.02 |  3.169 |           0.06 |    0.0% |      0.03 | `bench/crypto_hash.cpp RIPEMD160`
|                1.87 |      535,161,444.83 |    0.3% |           21.36 |            5.95 |  3.589 |           0.06 |    0.0% |      0.02 | `bench/crypto_hash.cpp SHA1`
|                3.22 |      310,344,174.79 |    1.1% |           36.80 |           10.22 |  3.601 |           0.09 |    0.0% |      0.04 | `bench/crypto_hash.cpp SHA256`
|                2.01 |      496,375,796.23 |    0.0% |           18.72 |            6.43 |  2.911 |           0.01 |    1.0% |      0.00 | `bench/crypto_hash.cpp SHA256D64_1024`
|                7.23 |      138,263,519.35 |    0.1% |           82.66 |           23.11 |  3.577 |           1.63 |    0.1% |      0.00 | `bench/crypto_hash.cpp SHA256_32b`
|                3.04 |      328,780,166.40 |    0.3% |           35.82 |            9.69 |  3.696 |           0.03 |    0.0% |      0.03 | `bench/crypto_hash.cpp SHA512`

[1] https://github.com/martinus/nanobench

* Adds support for asymptotes

  This adds support to calculate asymptotic complexity of a benchmark.
  This is similar to bitcoin#17375, but currently only one asymptote is
  supported, and I have added support in the benchmark `ComplexMemPool`
  as an example.

  Usage is e.g. like this:

  ```
  ./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
  ```

  This runs the benchmark `ComplexMemPool` several times but with
  different complexityN settings. The benchmark can extract that number
  and use it accordingly. Here, it's used for `childTxs`. The output is
  this:

  | complexityN |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |     total | benchmark
  |------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
  |          25 |        1,064,241.00 |              939.64 |    1.4% |    3,960,279.00 |    2,829,708.00 |  1.400 |      0.01 | `ComplexMemPool`
  |          50 |        1,579,530.00 |              633.10 |    1.0% |    6,231,810.00 |    4,412,674.00 |  1.412 |      0.02 | `ComplexMemPool`
  |         100 |        4,022,774.00 |              248.58 |    0.6% |   16,544,406.00 |   11,889,535.00 |  1.392 |      0.04 | `ComplexMemPool`
  |         200 |       15,390,986.00 |               64.97 |    0.2% |   63,904,254.00 |   47,731,705.00 |  1.339 |      0.17 | `ComplexMemPool`
  |         400 |       69,394,711.00 |               14.41 |    0.1% |  272,602,461.00 |  219,014,691.00 |  1.245 |      0.76 | `ComplexMemPool`
  |         600 |      168,977,165.00 |                5.92 |    0.1% |  639,108,082.00 |  535,316,887.00 |  1.194 |      1.86 | `ComplexMemPool`
  |         800 |      310,109,077.00 |                3.22 |    0.1% |1,149,134,246.00 |  984,620,812.00 |  1.167 |      3.41 | `ComplexMemPool`

  |   coefficient |   err% | complexity
  |--------------:|-------:|------------
  |   4.78486e-07 |   4.5% | O(n^2)
  |   6.38557e-10 |  21.7% | O(n^3)
  |   3.42338e-05 |  38.0% | O(n log n)
  |   0.000313914 |  46.9% | O(n)
  |     0.0129823 | 114.4% | O(log n)
  |     0.0815055 | 133.8% | O(1)

  The best fitting curve is O(n^2), so the algorithm seems to scale
  quadratic with `childTxs` in the range 25 to 800.

# Conflicts:
#	.appveyor.yml
#	contrib/devtools/copyright_header.py
#	doc/benc
7656
hmarking.md
#	src/Makefile.bench.include
#	src/Makefile.test.include
#	src/bench/addrman.cpp
#	src/bench/base58.cpp
#	src/bench/bech32.cpp
#	src/bench/bench.cpp
#	src/bench/bench.h
#	src/bench/bench_bitcoin.cpp
#	src/bench/block_assemble.cpp
#	src/bench/ccoins_caching.cpp
#	src/bench/checkblock.cpp
#	src/bench/coin_selection.cpp
#	src/bench/crypto_hash.cpp
#	src/bench/duplicate_inputs.cpp
#	src/bench/examples.cpp
#	src/bench/gcs_filter.cpp
#	src/bench/hashpadding.cpp
#	src/bench/mempool_eviction.cpp
#	src/bench/mempool_stress.cpp
#	src/bench/prevector.cpp
#	src/bench/rollingbloom.cpp
#	src/bench/rpc_blockchain.cpp
#	src/bench/rpc_mempool.cpp
#	src/bench/verify_script.cpp
#	src/bench/wallet_balance.cpp
#	test/lint/lint-include-guards.sh
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0