[Refactor, Tests] Move TestCudagraphs by vmoens · Pull Request #1007 · pytorch/tensordict · GitHub

[Refactor, Tests] Move TestCudagraphs #1007

Merged: 1 commit, Sep 23, 2024

@vmoens (Collaborator) commented Sep 23, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Sep 23, 2024
ghstack-source-id: ee69a57
Pull Request resolved: #1007
@facebook-github-bot facebook-github-bot added the CLA Signed label Sep 23, 2024
@vmoens vmoens merged commit 150086b into gh/vmoens/19/base Sep 23, 2024
37 checks passed
vmoens pushed a commit that referenced this pull request Sep 23, 2024
ghstack-source-id: ee69a57
Pull Request resolved: #1007
@vmoens vmoens deleted the gh/vmoens/19/head branch September 23, 2024 14:57

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}22$. Worsened: $\large\color{#d91a1a}23$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 63.8190μs 19.4904μs 51.3073 KOps/s 47.5033 KOps/s $\textbf{\color{#35bf28}+8.01\%}$
test_plain_set_stack_nested 49.2920μs 19.7107μs 50.7339 KOps/s 47.0931 KOps/s $\textbf{\color{#35bf28}+7.73\%}$
test_plain_set_nested_inplace 47.5490μs 21.2260μs 47.1119 KOps/s 43.7852 KOps/s $\textbf{\color{#35bf28}+7.60\%}$
test_plain_set_stack_nested_inplace 53.0280μs 21.2354μs 47.0913 KOps/s 43.6097 KOps/s $\textbf{\color{#35bf28}+7.98\%}$
test_items 30.4660μs 4.3587μs 229.4285 KOps/s 242.4212 KOps/s $\textbf{\color{#d91a1a}-5.36\%}$
test_items_nested 0.6000ms 0.3823ms 2.6160 KOps/s 2.7374 KOps/s $\color{#d91a1a}-4.43\%$
test_items_nested_locked 0.6839ms 0.3722ms 2.6864 KOps/s 2.7207 KOps/s $\color{#d91a1a}-1.26\%$
test_items_nested_leaf 0.1248ms 69.0385μs 14.4847 KOps/s 13.8159 KOps/s $\color{#35bf28}+4.84\%$
test_items_stack_nested 0.6784ms 0.3751ms 2.6662 KOps/s 2.7241 KOps/s $\color{#d91a1a}-2.13\%$
test_items_stack_nested_leaf 0.1269ms 72.2468μs 13.8414 KOps/s 14.4701 KOps/s $\color{#d91a1a}-4.34\%$
test_items_stack_nested_locked 0.6605ms 0.3731ms 2.6803 KOps/s 2.7396 KOps/s $\color{#d91a1a}-2.16\%$
test_keys 19.8170μs 3.4953μs 286.1008 KOps/s 287.6555 KOps/s $\color{#d91a1a}-0.54\%$
test_keys_nested 0.2784ms 99.9998μs 10.0000 KOps/s 9.8645 KOps/s $\color{#35bf28}+1.37\%$
test_keys_nested_locked 1.6616ms 0.1099ms 9.0975 KOps/s 9.4794 KOps/s $\color{#d91a1a}-4.03\%$
test_keys_nested_leaf 0.1462ms 82.8358μs 12.0721 KOps/s 11.8545 KOps/s $\color{#35bf28}+1.84\%$
test_keys_stack_nested 0.1821ms 0.1019ms 9.8180 KOps/s 9.6784 KOps/s $\color{#35bf28}+1.44\%$
test_keys_stack_nested_leaf 0.1477ms 84.2713μs 11.8664 KOps/s 11.8571 KOps/s $\color{#35bf28}+0.08\%$
test_keys_stack_nested_locked 0.1825ms 0.1066ms 9.3849 KOps/s 9.2668 KOps/s $\color{#35bf28}+1.27\%$
test_values 7.8168μs 1.0445μs 957.3761 KOps/s 947.6831 KOps/s $\color{#35bf28}+1.02\%$
test_values_nested 0.1245ms 75.0978μs 13.3160 KOps/s 13.4880 KOps/s $\color{#d91a1a}-1.28\%$
test_values_nested_locked 0.1511ms 75.1563μs 13.3056 KOps/s 13.5490 KOps/s $\color{#d91a1a}-1.80\%$
test_values_nested_leaf 0.1075ms 62.2422μs 16.0663 KOps/s 15.9953 KOps/s $\color{#35bf28}+0.44\%$
test_values_stack_nested 0.1196ms 75.7592μs 13.1997 KOps/s 13.4541 KOps/s $\color{#d91a1a}-1.89\%$
test_values_stack_nested_leaf 0.1219ms 61.9558μs 16.1405 KOps/s 16.1805 KOps/s $\color{#d91a1a}-0.25\%$
test_values_stack_nested_locked 0.1318ms 76.7926μs 13.0221 KOps/s 13.0504 KOps/s $\color{#d91a1a}-0.22\%$
test_membership 4.2923μs 0.7729μs 1.2939 MOps/s 1.1750 MOps/s $\textbf{\color{#35bf28}+10.12\%}$
test_membership_nested 53.4800μs 2.8262μs 353.8292 KOps/s 362.3014 KOps/s $\color{#d91a1a}-2.34\%$
test_membership_nested_leaf 27.7310μs 2.8187μs 354.7680 KOps/s 359.3410 KOps/s $\color{#d91a1a}-1.27\%$
test_membership_stacked_nested 28.7040μs 2.7936μs 357.9649 KOps/s 361.2234 KOps/s $\color{#d91a1a}-0.90\%$
test_membership_stacked_nested_leaf 17.8730μs 2.8135μs 355.4313 KOps/s 360.6913 KOps/s $\color{#d91a1a}-1.46\%$
test_membership_nested_last 25.7780μs 4.0541μs 246.6609 KOps/s 248.4125 KOps/s $\color{#d91a1a}-0.71\%$
test_membership_nested_leaf_last 29.4650μs 4.0757μs 245.3583 KOps/s 249.9641 KOps/s $\color{#d91a1a}-1.84\%$
test_membership_stacked_nested_last 20.8690μs 4.6518μs 214.9703 KOps/s 245.8539 KOps/s $\textbf{\color{#d91a1a}-12.56\%}$
test_membership_stacked_nested_leaf_last 34.1830μs 4.6068μs 217.0693 KOps/s 249.0234 KOps/s $\textbf{\color{#d91a1a}-12.83\%}$
test_nested_getleaf 61.6580μs 10.8327μs 92.3129 KOps/s 89.3279 KOps/s $\color{#35bf28}+3.34\%$
test_nested_get 30.9080μs 10.2452μs 97.6062 KOps/s 95.0300 KOps/s $\color{#35bf28}+2.71\%$
test_stacked_getleaf 50.4940μs 10.7949μs 92.6367 KOps/s 90.8833 KOps/s $\color{#35bf28}+1.93\%$
test_stacked_get 53.6700μs 10.3428μs 96.6861 KOps/s 95.9723 KOps/s $\color{#35bf28}+0.74\%$
test_nested_getitemleaf 50.8220μs 10.9245μs 91.5377 KOps/s 87.7512 KOps/s $\color{#35bf28}+4.32\%$
test_nested_getitem 46.0460μs 10.4762μs 95.4547 KOps/s 93.4400 KOps/s $\color{#35bf28}+2.16\%$
test_stacked_getitemleaf 37.4500μs 11.1865μs 89.3932 KOps/s 87.3603 KOps/s $\color{#35bf28}+2.33\%$
test_stacked_getitem 45.4950μs 10.6766μs 93.6630 KOps/s 94.4440 KOps/s $\color{#d91a1a}-0.83\%$
test_lock_nested 94.9021ms 0.5975ms 1.6737 KOps/s 2.0121 KOps/s $\textbf{\color{#d91a1a}-16.82\%}$
test_lock_stack_nested 0.6233ms 0.4652ms 2.1495 KOps/s 2.1329 KOps/s $\color{#35bf28}+0.78\%$
test_unlock_nested 99.8934ms 0.5311ms 1.8828 KOps/s 2.2998 KOps/s $\textbf{\color{#d91a1a}-18.13\%}$
test_unlock_stack_nested 0.5005ms 0.3830ms 2.6110 KOps/s 2.5689 KOps/s $\color{#35bf28}+1.64\%$
test_flatten_speed 0.1573ms 90.1787μs 11.0891 KOps/s 11.3045 KOps/s $\color{#d91a1a}-1.91\%$
test_unflatten_speed 0.7608ms 0.4642ms 2.1544 KOps/s 2.1143 KOps/s $\color{#35bf28}+1.90\%$
test_common_ops 6.4290ms 1.1079ms 902.6472 Ops/s 853.1412 Ops/s $\textbf{\color{#35bf28}+5.80\%}$
test_creation 21.7810μs 2.1415μs 466.9525 KOps/s 484.7022 KOps/s $\color{#d91a1a}-3.66\%$
test_creation_empty 51.9670μs 16.0957μs 62.1283 KOps/s 53.3124 KOps/s $\textbf{\color{#35bf28}+16.54\%}$
test_creation_nested_1 89.4270μs 19.3007μs 51.8115 KOps/s 45.2083 KOps/s $\textbf{\color{#35bf28}+14.61\%}$
test_creation_nested_2 1.4103ms 24.0672μs 41.5503 KOps/s 38.1278 KOps/s $\textbf{\color{#35bf28}+8.98\%}$
test_clone 0.1373ms 18.1585μs 55.0707 KOps/s 55.7304 KOps/s $\color{#d91a1a}-1.18\%$
test_getitem[int] 0.8535ms 16.9605μs 58.9604 KOps/s 58.7351 KOps/s $\color{#35bf28}+0.38\%$
test_getitem[slice_int] 0.1610ms 31.6177μs 31.6279 KOps/s 32.3951 KOps/s $\color{#d91a1a}-2.37\%$
test_getitem[range] 0.3515ms 59.5601μs 16.7898 KOps/s 17.6614 KOps/s $\color{#d91a1a}-4.94\%$
test_getitem[tuple] 0.1337ms 25.3363μs 39.4690 KOps/s 39.0296 KOps/s $\color{#35bf28}+1.13\%$
test_getitem[list] 0.4696ms 55.1947μs 18.1177 KOps/s 19.0536 KOps/s $\color{#d91a1a}-4.91\%$
test_setitem_dim[int] 77.3250μs 33.7029μs 29.6710 KOps/s 30.2999 KOps/s $\color{#d91a1a}-2.08\%$
test_setitem_dim[slice_int] 0.1074ms 62.6542μs 15.9606 KOps/s 16.4111 KOps/s $\color{#d91a1a}-2.75\%$
test_setitem_dim[range] 0.1414ms 85.9333μs 11.6369 KOps/s 11.8814 KOps/s $\color{#d91a1a}-2.06\%$
test_setitem_dim[tuple] 96.0690μs 49.8545μs 20.0584 KOps/s 20.0451 KOps/s $\color{#35bf28}+0.07\%$
test_setitem 0.1725ms 30.0284μs 33.3018 KOps/s 33.1105 KOps/s $\color{#35bf28}+0.58\%$
test_set 0.1450ms 29.6147μs 33.7670 KOps/s 33.9503 KOps/s $\color{#d91a1a}-0.54\%$
test_set_shared 3.1777ms 0.2212ms 4.5217 KOps/s 4.5589 KOps/s $\color{#d91a1a}-0.82\%$
test_update 0.1838ms 36.4535μs 27.4322 KOps/s 27.0793 KOps/s $\color{#35bf28}+1.30\%$
test_update_nested 0.1991ms 46.7521μs 21.3894 KOps/s 20.9406 KOps/s $\color{#35bf28}+2.14\%$
test_update__nested 0.1705ms 37.6004μs 26.5955 KOps/s 28.1266 KOps/s $\textbf{\color{#d91a1a}-5.44\%}$
test_set_nested 0.1529ms 31.5099μs 31.7360 KOps/s 31.0192 KOps/s $\color{#35bf28}+2.31\%$
test_set_nested_new 1.1690ms 36.2964μs 27.5510 KOps/s 26.5510 KOps/s $\color{#35bf28}+3.77\%$
test_select 0.1920ms 53.5958μs 18.6582 KOps/s 18.4450 KOps/s $\color{#35bf28}+1.16\%$
test_select_nested 0.1372ms 62.2960μs 16.0524 KOps/s 16.9147 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_exclude_nested 0.1522ms 77.3484μs 12.9285 KOps/s 13.2198 KOps/s $\color{#d91a1a}-2.20\%$
test_empty[True] 0.4780ms 0.3253ms 3.0740 KOps/s 3.1602 KOps/s $\color{#d91a1a}-2.73\%$
test_empty[False] 11.2553μs 1.2706μs 787.0029 KOps/s 792.0396 KOps/s $\color{#d91a1a}-0.64\%$
test_unbind_speed 0.4217ms 0.3091ms 3.2352 KOps/s 3.3161 KOps/s $\color{#d91a1a}-2.44\%$
test_unbind_speed_stack0 0.4976ms 0.3035ms 3.2946 KOps/s 3.3505 KOps/s $\color{#d91a1a}-1.67\%$
test_unbind_speed_stack1 98.9143ms 0.8246ms 1.2127 KOps/s 1.3141 KOps/s $\textbf{\color{#d91a1a}-7.71\%}$
test_split 95.2333ms 2.2164ms 451.1888 Ops/s 456.6632 Ops/s $\color{#d91a1a}-1.20\%$
test_chunk 2.4728ms 2.0277ms 493.1800 Ops/s 455.8895 Ops/s $\textbf{\color{#35bf28}+8.18\%}$
test_creation[device0] 3.8312ms 0.1212ms 8.2535 KOps/s 8.3524 KOps/s $\color{#d91a1a}-1.18\%$
test_creation_from_tensor 0.2779ms 0.1173ms 8.5239 KOps/s 8.3701 KOps/s $\color{#35bf28}+1.84\%$
test_add_one[memmap_tensor0] 0.3640ms 7.5479μs 132.4868 KOps/s 134.6510 KOps/s $\color{#d91a1a}-1.61\%$
test_contiguous[memmap_tensor0] 21.9910μs 1.9203μs 520.7402 KOps/s 517.4948 KOps/s $\color{#35bf28}+0.63\%$
test_stack[memmap_tensor0] 37.0590μs 5.8022μs 172.3474 KOps/s 169.7192 KOps/s $\color{#35bf28}+1.55\%$
test_memmaptd_index 1.1699ms 0.4045ms 2.4719 KOps/s 2.4698 KOps/s $\color{#35bf28}+0.08\%$
test_memmaptd_index_astensor 0.8284ms 0.4849ms 2.0622 KOps/s 2.0648 KOps/s $\color{#d91a1a}-0.13\%$
test_memmaptd_index_op 1.6524ms 1.0182ms 982.1419 Ops/s 959.9535 Ops/s $\color{#35bf28}+2.31\%$
test_serialize_model 0.2172s 0.1326s 7.5409 Ops/s 8.2080 Ops/s $\textbf{\color{#d91a1a}-8.13\%}$
test_serialize_model_pickle 0.4863s 0.3932s 2.5434 Ops/s 2.5155 Ops/s $\color{#35bf28}+1.11\%$
test_serialize_weights 0.1259s 0.1174s 8.5189 Ops/s 7.5031 Ops/s $\textbf{\color{#35bf28}+13.54\%}$
test_serialize_weights_returnearly 0.2732s 0.1747s 5.7229 Ops/s 6.3647 Ops/s $\textbf{\color{#d91a1a}-10.08\%}$
test_serialize_weights_pickle 0.4476s 0.4079s 2.4513 Ops/s 2.2782 Ops/s $\textbf{\color{#35bf28}+7.60\%}$
test_serialize_weights_filesystem 0.1477s 0.1420s 7.0399 Ops/s 6.7694 Ops/s $\color{#35bf28}+4.00\%$
test_serialize_model_filesystem 0.1524s 0.1490s 6.7117 Ops/s 6.5201 Ops/s $\color{#35bf28}+2.94\%$
test_reshape_pytree 0.1031ms 40.9380μs 24.4272 KOps/s 25.8865 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_reshape_td 0.1159ms 47.2833μs 21.1491 KOps/s 21.2031 KOps/s $\color{#d91a1a}-0.25\%$
test_view_pytree 0.1088ms 39.9666μs 25.0209 KOps/s 25.8430 KOps/s $\color{#d91a1a}-3.18\%$
test_view_td 0.1296ms 52.7376μs 18.9618 KOps/s 18.9571 KOps/s $\color{#35bf28}+0.02\%$
test_unbind_pytree 83.1350μs 37.0858μs 26.9645 KOps/s 27.4988 KOps/s $\color{#d91a1a}-1.94\%$
test_unbind_td 0.3157ms 46.7113μs 21.4081 KOps/s 21.8677 KOps/s $\color{#d91a1a}-2.10\%$
test_split_pytree 94.3670μs 38.4858μs 25.9836 KOps/s 26.4971 KOps/s $\color{#d91a1a}-1.94\%$
test_split_td 0.2096ms 58.7640μs 17.0172 KOps/s 17.1089 KOps/s $\color{#d91a1a}-0.54\%$
test_add_pytree 0.1137ms 47.0857μs 21.2379 KOps/s 22.0154 KOps/s $\color{#d91a1a}-3.53\%$
test_add_td 0.1707ms 80.1465μs 12.4772 KOps/s 11.8905 KOps/s $\color{#35bf28}+4.93\%$
test_compile_add_one_nested[tensordict-compile] 0.1151ms 60.9838μs 16.3978 KOps/s 17.2419 KOps/s $\color{#d91a1a}-4.90\%$
test_compile_add_one_nested[tensordict-eager] 0.3349ms 0.1807ms 5.5329 KOps/s 5.5793 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_add_one_nested[pytree-compile] 0.1253ms 59.7733μs 16.7299 KOps/s 17.3523 KOps/s $\color{#d91a1a}-3.59\%$
test_compile_add_one_nested[pytree-eager] 0.2957ms 0.1464ms 6.8323 KOps/s 7.1741 KOps/s $\color{#d91a1a}-4.76\%$
test_compile_copy_nested[tensordict-compile] 55.0730μs 21.3561μs 46.8250 KOps/s 47.0072 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_copy_nested[tensordict-eager] 0.1393ms 68.1817μs 14.6667 KOps/s 14.8423 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_copy_nested[pytree-compile] 0.1519ms 77.5262μs 12.8989 KOps/s 13.1368 KOps/s $\color{#d91a1a}-1.81\%$
test_compile_copy_nested[pytree-eager] 0.1392ms 70.5915μs 14.1660 KOps/s 14.5559 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_add_one_flat[tensordict-compile] 0.2814ms 0.1763ms 5.6713 KOps/s 5.6692 KOps/s $\color{#35bf28}+0.04\%$
test_compile_add_one_flat[tensordict-eager] 0.3281ms 0.1945ms 5.1403 KOps/s 5.2052 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_add_one_flat[tensorclass-compile] 0.1046ms 50.0738μs 19.9705 KOps/s 21.5710 KOps/s $\textbf{\color{#d91a1a}-7.42\%}$
test_compile_add_one_flat[tensorclass-eager] 0.3614ms 70.5395μs 14.1765 KOps/s 14.5461 KOps/s $\color{#d91a1a}-2.54\%$
test_compile_add_one_flat[pytree-compile] 0.3501ms 0.1814ms 5.5138 KOps/s 5.7205 KOps/s $\color{#d91a1a}-3.61\%$
test_compile_add_one_flat[pytree-eager] 0.5285ms 0.2922ms 3.4221 KOps/s 3.5462 KOps/s $\color{#d91a1a}-3.50\%$
test_compile_add_self_flat[tensordict-eager] 0.3948ms 0.2098ms 4.7662 KOps/s 4.9113 KOps/s $\color{#d91a1a}-2.95\%$
test_compile_add_self_flat[tensordict-compile] 0.7351ms 0.1842ms 5.4278 KOps/s 5.6748 KOps/s $\color{#d91a1a}-4.35\%$
test_compile_add_self_flat[tensorclass-eager] 0.1307ms 63.1709μs 15.8301 KOps/s 16.0036 KOps/s $\color{#d91a1a}-1.08\%$
test_compile_add_self_flat[tensorclass-compile] 0.1027ms 49.8235μs 20.0708 KOps/s 21.0259 KOps/s $\color{#d91a1a}-4.54\%$
test_compile_add_self_flat[pytree-eager] 0.4012ms 0.2358ms 4.2405 KOps/s 4.3301 KOps/s $\color{#d91a1a}-2.07\%$
test_compile_add_self_flat[pytree-compile] 0.2652ms 0.1784ms 5.6051 KOps/s 5.6733 KOps/s $\color{#d91a1a}-1.20\%$
test_compile_copy_flat[tensordict-compile] 0.1976ms 0.1039ms 9.6256 KOps/s 9.6315 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_copy_flat[tensordict-eager] 0.1328ms 62.1894μs 16.0799 KOps/s 17.5769 KOps/s $\textbf{\color{#d91a1a}-8.52\%}$
test_compile_copy_flat[pytree-compile] 0.1808ms 80.4238μs 12.4341 KOps/s 13.1055 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_compile_copy_flat[pytree-eager] 0.1385ms 72.5828μs 13.7774 KOps/s 14.5624 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_compile_assign_and_add[tensordict-compile] 0.3683ms 0.2007ms 4.9815 KOps/s 5.0887 KOps/s $\color{#d91a1a}-2.11\%$
test_compile_assign_and_add[tensordict-eager] 2.7981ms 1.7200ms 581.4095 Ops/s 587.0132 Ops/s $\color{#d91a1a}-0.95\%$
test_compile_assign_and_add[pytree-compile] 0.3567ms 0.1994ms 5.0163 KOps/s 5.1104 KOps/s $\color{#d91a1a}-1.84\%$
test_compile_assign_and_add[pytree-eager] 2.0119ms 1.1266ms 887.6043 Ops/s 910.6509 Ops/s $\color{#d91a1a}-2.53\%$
test_compile_assign_and_add_stack[compile] 0.7280ms 0.4320ms 2.3147 KOps/s 2.3291 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_assign_and_add_stack[eager] 4.1028ms 3.7060ms 269.8359 Ops/s 261.5845 Ops/s $\color{#35bf28}+3.15\%$
test_compile_indexing[tensor-tensordict-compile] 86.8220μs 37.0049μs 27.0235 KOps/s 27.0620 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_indexing[tensor-tensordict-eager] 0.6974ms 49.8391μs 20.0646 KOps/s 20.2315 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_indexing[tensor-tensorclass-compile] 96.8410μs 31.8620μs 31.3854 KOps/s 33.8090 KOps/s $\textbf{\color{#d91a1a}-7.17\%}$
test_compile_indexing[tensor-tensorclass-eager] 84.6980μs 30.8363μs 32.4293 KOps/s 34.2356 KOps/s $\textbf{\color{#d91a1a}-5.28\%}$
test_compile_indexing[tensor-pytree-compile] 82.8250μs 31.7886μs 31.4578 KOps/s 33.4391 KOps/s $\textbf{\color{#d91a1a}-5.93\%}$
test_compile_indexing[tensor-pytree-eager] 85.1890μs 30.4007μs 32.8939 KOps/s 32.3318 KOps/s $\color{#35bf28}+1.74\%$
test_compile_indexing[slice-tensordict-compile] 0.2219ms 78.9768μs 12.6620 KOps/s 13.1645 KOps/s $\color{#d91a1a}-3.82\%$
test_compile_indexing[slice-tensordict-eager] 0.6007ms 28.0413μs 35.6616 KOps/s 36.0540 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_indexing[slice-tensorclass-compile] 0.1417ms 71.4221μs 14.0013 KOps/s 14.5574 KOps/s $\color{#d91a1a}-3.82\%$
test_compile_indexing[slice-tensorclass-eager] 73.2570μs 24.1314μs 41.4398 KOps/s 43.8709 KOps/s $\textbf{\color{#d91a1a}-5.54\%}$
test_compile_indexing[slice-pytree-compile] 0.1679ms 71.1865μs 14.0476 KOps/s 14.5477 KOps/s $\color{#d91a1a}-3.44\%$
test_compile_indexing[slice-pytree-eager] 81.0410μs 23.5817μs 42.4059 KOps/s 44.1074 KOps/s $\color{#d91a1a}-3.86\%$
test_compile_indexing[int-tensordict-compile] 2.0120ms 76.8207μs 13.0173 KOps/s 13.5417 KOps/s $\color{#d91a1a}-3.87\%$
test_compile_indexing[int-tensordict-eager] 1.0095ms 28.3427μs 35.2824 KOps/s 36.9218 KOps/s $\color{#d91a1a}-4.44\%$
test_compile_indexing[int-tensorclass-compile] 0.1947ms 70.5427μs 14.1758 KOps/s 14.7440 KOps/s $\color{#d91a1a}-3.85\%$
test_compile_indexing[int-tensorclass-eager] 77.7860μs 23.5438μs 42.4741 KOps/s 44.5894 KOps/s $\color{#d91a1a}-4.74\%$
test_compile_indexing[int-pytree-compile] 0.4742ms 75.1176μs 13.3125 KOps/s 14.8014 KOps/s $\textbf{\color{#d91a1a}-10.06\%}$
test_compile_indexing[int-pytree-eager] 66.2640μs 23.4544μs 42.6359 KOps/s 44.5908 KOps/s $\color{#d91a1a}-4.38\%$
test_mod_add[eager] 70.1910μs 23.9264μs 41.7949 KOps/s 39.5865 KOps/s $\textbf{\color{#35bf28}+5.58\%}$
test_mod_add[compile] 89.0170μs 42.6619μs 23.4401 KOps/s 25.6930 KOps/s $\textbf{\color{#d91a1a}-8.77\%}$
test_mod_add[compile-overhead] 0.1564ms 40.9841μs 24.3997 KOps/s 25.5666 KOps/s $\color{#d91a1a}-4.56\%$
test_mod_wrap[eager] 0.3900ms 0.2105ms 4.7512 KOps/s 4.8370 KOps/s $\color{#d91a1a}-1.77\%$
test_mod_wrap[compile] 0.4584ms 0.2431ms 4.1141 KOps/s 4.2735 KOps/s $\color{#d91a1a}-3.73\%$
test_mod_wrap[compile-overhead] 0.4755ms 0.2385ms 4.1933 KOps/s 4.2870 KOps/s $\color{#d91a1a}-2.19\%$
test_mod_wrap_and_backward[eager] 12.6853ms 10.9498ms 91.3262 Ops/s 89.4736 Ops/s $\color{#35bf28}+2.07\%$
test_mod_wrap_and_backward[compile] 12.3971ms 10.9847ms 91.0356 Ops/s 84.1198 Ops/s $\textbf{\color{#35bf28}+8.22\%}$
test_mod_wrap_and_backward[compile-overhead] 12.2231ms 10.8661ms 92.0289 Ops/s 82.5283 Ops/s $\textbf{\color{#35bf28}+11.51\%}$
test_seq_add[eager] 0.1616ms 86.1266μs 11.6108 KOps/s 10.7817 KOps/s $\textbf{\color{#35bf28}+7.69\%}$
test_seq_add[compile] 0.1382ms 67.5413μs 14.8058 KOps/s 14.8867 KOps/s $\color{#d91a1a}-0.54\%$
test_seq_add[compile-overhead] 0.1247ms 66.8899μs 14.9499 KOps/s 15.3090 KOps/s $\color{#d91a1a}-2.35\%$
test_seq_wrap[eager] 0.6898ms 0.3911ms 2.5571 KOps/s 2.5031 KOps/s $\color{#35bf28}+2.16\%$
test_seq_wrap[compile] 1.2881ms 0.2752ms 3.6339 KOps/s 3.6061 KOps/s $\color{#35bf28}+0.77\%$
test_seq_wrap[compile-overhead] 1.2576ms 0.2747ms 3.6399 KOps/s 3.5828 KOps/s $\color{#35bf28}+1.59\%$
test_func_call_runtime[False-eager] 0.9418ms 0.5393ms 1.8543 KOps/s 1.8474 KOps/s $\color{#35bf28}+0.37\%$
test_func_call_runtime[False-compile] 0.8838ms 0.5133ms 1.9483 KOps/s 1.9600 KOps/s $\color{#d91a1a}-0.60\%$
test_func_call_runtime[False-compile-overhead] 0.7513ms 0.5142ms 1.9446 KOps/s 1.9580 KOps/s $\color{#d91a1a}-0.68\%$
test_func_call_runtime[True-eager] 1.2411ms 0.7611ms 1.3139 KOps/s 1.3331 KOps/s $\color{#d91a1a}-1.44\%$
test_func_call_runtime[True-compile] 0.6645ms 0.5239ms 1.9089 KOps/s 1.9255 KOps/s $\color{#d91a1a}-0.86\%$
test_func_call_runtime[True-compile-overhead] 1.0350ms 0.5258ms 1.9018 KOps/s 1.9313 KOps/s $\color{#d91a1a}-1.53\%$
test_func_call_cm_runtime[False-eager] 1.0464ms 0.5332ms 1.8753 KOps/s 1.9347 KOps/s $\color{#d91a1a}-3.07\%$
test_func_call_cm_runtime[False-compile] 0.7277ms 0.5192ms 1.9259 KOps/s 1.9497 KOps/s $\color{#d91a1a}-1.22\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8632ms 0.5188ms 1.9275 KOps/s 1.9513 KOps/s $\color{#d91a1a}-1.22\%$
test_func_call_cm_runtime[True-eager] 1.8262ms 0.8896ms 1.1241 KOps/s 1.1357 KOps/s $\color{#d91a1a}-1.02\%$
test_func_call_cm_runtime[True-compile] 1.1855ms 0.7508ms 1.3320 KOps/s 1.3300 KOps/s $\color{#35bf28}+0.15\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9286ms 0.7489ms 1.3354 KOps/s 1.3330 KOps/s $\color{#35bf28}+0.18\%$
test_vmap_func_call_cm_runtime[eager] 2.6371ms 1.8744ms 533.5147 Ops/s 527.5733 Ops/s $\color{#35bf28}+1.13\%$
test_vmap_func_call_cm_runtime[compile] 3.2500ms 1.9512ms 512.4963 Ops/s 493.0278 Ops/s $\color{#35bf28}+3.95\%$
test_vmap_func_call_cm_runtime[compile-overhead] 3.1985ms 1.9450ms 514.1470 Ops/s 505.6207 Ops/s $\color{#35bf28}+1.69\%$
test_distributed 1.1597ms 0.1253ms 7.9812 KOps/s 7.6873 KOps/s $\color{#35bf28}+3.82\%$
test_tdmodule 46.5270μs 16.7263μs 59.7861 KOps/s 51.8821 KOps/s $\textbf{\color{#35bf28}+15.23\%}$
test_tdmodule_dispatch 59.9820μs 33.4933μs 29.8567 KOps/s 26.2315 KOps/s $\textbf{\color{#35bf28}+13.82\%}$
test_tdseq 37.1990μs 18.8906μs 52.9364 KOps/s 44.8488 KOps/s $\textbf{\color{#35bf28}+18.03\%}$
test_tdseq_dispatch 70.8820μs 37.2046μs 26.8784 KOps/s 22.9973 KOps/s $\textbf{\color{#35bf28}+16.88\%}$
test_instantiation_functorch 2.6201ms 1.6505ms 605.8828 Ops/s 623.6273 Ops/s $\color{#d91a1a}-2.85\%$
test_instantiation_td 2.1251ms 1.2055ms 829.5473 Ops/s 834.3582 Ops/s $\color{#d91a1a}-0.58\%$
test_exec_functorch 0.3228ms 0.1916ms 5.2185 KOps/s 5.3174 KOps/s $\color{#d91a1a}-1.86\%$
test_exec_functional_call 0.4413ms 0.1804ms 5.5433 KOps/s 5.6510 KOps/s $\color{#d91a1a}-1.91\%$
test_exec_td 0.2772ms 0.1796ms 5.5673 KOps/s 5.7751 KOps/s $\color{#d91a1a}-3.60\%$
test_exec_td_decorator 1.0623ms 0.2322ms 4.3069 KOps/s 4.3663 KOps/s $\color{#d91a1a}-1.36\%$
test_vmap_mlp_speed[True-True] 0.9457ms 0.6553ms 1.5261 KOps/s 1.5112 KOps/s $\color{#35bf28}+0.98\%$
test_vmap_mlp_speed[True-False] 1.0761ms 0.6459ms 1.5481 KOps/s 1.5251 KOps/s $\color{#35bf28}+1.51\%$
test_vmap_mlp_speed[False-True] 0.7702ms 0.5105ms 1.9591 KOps/s 1.9985 KOps/s $\color{#d91a1a}-1.98\%$
test_vmap_mlp_speed[False-False] 0.7345ms 0.5100ms 1.9609 KOps/s 1.9995 KOps/s $\color{#d91a1a}-1.93\%$
test_vmap_mlp_speed_decorator[True-True] 1.4718ms 0.6287ms 1.5906 KOps/s 1.5966 KOps/s $\color{#d91a1a}-0.37\%$
test_vmap_mlp_speed_decorator[True-False] 0.8379ms 0.6245ms 1.6013 KOps/s 1.5793 KOps/s $\color{#35bf28}+1.39\%$
test_vmap_mlp_speed_decorator[False-True] 0.7944ms 0.5262ms 1.9004 KOps/s 1.9491 KOps/s $\color{#d91a1a}-2.50\%$
test_vmap_mlp_speed_decorator[False-False] 0.8330ms 0.5236ms 1.9099 KOps/s 1.9545 KOps/s $\color{#d91a1a}-2.28\%$
test_to_module_speed[True] 1.9563ms 1.3699ms 729.9865 Ops/s 763.5646 Ops/s $\color{#d91a1a}-4.40\%$
test_to_module_speed[False] 2.0766ms 1.3250ms 754.7081 Ops/s 786.5195 Ops/s $\color{#d91a1a}-4.04\%$
test_tc_init 91.6510μs 43.4126μs 23.0348 KOps/s 21.4413 KOps/s $\textbf{\color{#35bf28}+7.43\%}$
test_tc_init_nested 0.1528ms 87.7558μs 11.3953 KOps/s 10.7619 KOps/s $\textbf{\color{#35bf28}+5.88\%}$
test_tc_first_layer_tensor 28.2930μs 1.5532μs 643.8201 KOps/s 620.4163 KOps/s $\color{#35bf28}+3.77\%$
test_tc_first_layer_nontensor 23.1130μs 4.7589μs 210.1334 KOps/s 205.6048 KOps/s $\color{#35bf28}+2.20\%$
test_tc_second_layer_tensor 29.0840μs 2.8610μs 349.5253 KOps/s 342.2075 KOps/s $\color{#35bf28}+2.14\%$
test_tc_second_layer_nontensor 47.2680μs 6.2161μs 160.8727 KOps/s 157.7740 KOps/s $\color{#35bf28}+1.96\%$
test_unbind 0.5017s 13.9953ms 71.4526 Ops/s 72.5946 Ops/s $\color{#d91a1a}-1.57\%$
test_full_like 10.2575ms 8.3171ms 120.2340 Ops/s 118.2185 Ops/s $\color{#35bf28}+1.70\%$
test_zeros_like 4.6730ms 3.1792ms 314.5481 Ops/s 306.5735 Ops/s $\color{#35bf28}+2.60\%$
test_ones_like 12.9689ms 6.4191ms 155.7853 Ops/s 262.9863 Ops/s $\textbf{\color{#d91a1a}-40.76\%}$
test_clone 12.5125ms 8.2802ms 120.7704 Ops/s 180.3729 Ops/s $\textbf{\color{#d91a1a}-33.04\%}$
test_squeeze 76.4630μs 13.1824μs 75.8589 KOps/s 79.0838 KOps/s $\color{#d91a1a}-4.08\%$
test_unsqueeze 0.3910ms 93.5941μs 10.6844 KOps/s 10.6960 KOps/s $\color{#d91a1a}-0.11\%$
test_split 0.3486ms 0.1967ms 5.0838 KOps/s 5.0476 KOps/s $\color{#35bf28}+0.72\%$
test_permute 0.4509ms 0.2233ms 4.4788 KOps/s 4.4700 KOps/s $\color{#35bf28}+0.20\%$
test_stack 31.3649ms 26.1552ms 38.2333 Ops/s 38.1525 Ops/s $\color{#35bf28}+0.21\%$
test_cat 32.5375ms 25.7611ms 38.8183 Ops/s 39.6790 Ops/s $\color{#d91a1a}-2.17\%$
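
For readers checking the table above, here is a minimal sketch of how its columns appear to relate. It assumes that Ops is the reciprocal of Mean, that Change is the relative difference between Ops and Ops on Repo HEAD, and that a row is bold (and counted as Improved or Worsened) when the change reaches 5% in either direction. The 5% threshold is inferred from which rows are bold, not stated by the benchmark bot, and the function names are hypothetical.

```python
def ops_from_mean_us(mean_us: float) -> float:
    """Throughput in KOps/s for a mean runtime given in microseconds."""
    # 1 / (mean_us * 1e-6 s) ops/s, divided by 1e3 to express it in KOps/s.
    return 1e3 / mean_us


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput versus Repo HEAD, in percent."""
    return (ops / ops_head - 1.0) * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the table."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) in the same unit, copied from rows of the CPU table.
rows = {
    "test_plain_set_nested": (51.3073, 47.5033),
    "test_items": (229.4285, 242.4212),
    "test_items_nested": (2.6160, 2.7374),
    "test_ones_like": (155.7853, 262.9863),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the sketch reproduces the +8.01% reported for test_plain_set_nested and the -5.36% reported for test_items, and the number of bold rows in the CPU table matches the 22 improved and 23 worsened in the summary line. Because the table values are rounded, recomputed changes can differ from the reported ones in the last digit.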

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}23$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1367ms 14.3323μs 69.7723 KOps/s 73.7101 KOps/s $\textbf{\color{#d91a1a}-5.34\%}$
test_plain_set_stack_nested 41.6020μs 14.3136μs 69.8635 KOps/s 72.8874 KOps/s $\color{#d91a1a}-4.15\%$
test_plain_set_nested_inplace 50.3830μs 15.3191μs 65.2780 KOps/s 67.5550 KOps/s $\color{#d91a1a}-3.37\%$
test_plain_set_stack_nested_inplace 48.9820μs 15.2545μs 65.5546 KOps/s 67.2822 KOps/s $\color{#d91a1a}-2.57\%$
test_items 27.6620μs 2.9017μs 344.6255 KOps/s 344.6753 KOps/s $\color{#d91a1a}-0.01\%$
test_items_nested 0.5585ms 0.3253ms 3.0745 KOps/s 3.0933 KOps/s $\color{#d91a1a}-0.61\%$
test_items_nested_locked 0.3998ms 0.3303ms 3.0274 KOps/s 3.0524 KOps/s $\color{#d91a1a}-0.82\%$
test_items_nested_leaf 88.9140μs 55.5803μs 17.9920 KOps/s 17.9423 KOps/s $\color{#35bf28}+0.28\%$
test_items_stack_nested 0.3758ms 0.3275ms 3.0533 KOps/s 3.0145 KOps/s $\color{#35bf28}+1.29\%$
test_items_stack_nested_leaf 0.1051ms 56.0864μs 17.8296 KOps/s 17.8187 KOps/s $\color{#35bf28}+0.06\%$
test_items_stack_nested_locked 0.3545ms 0.3302ms 3.0286 KOps/s 3.0144 KOps/s $\color{#35bf28}+0.47\%$
test_keys 28.6520μs 3.4138μs 292.9301 KOps/s 291.9753 KOps/s $\color{#35bf28}+0.33\%$
test_keys_nested 96.6250μs 57.2704μs 17.4610 KOps/s 17.7975 KOps/s $\color{#d91a1a}-1.89\%$
test_keys_nested_locked 2.6074ms 62.6674μs 15.9573 KOps/s 16.1258 KOps/s $\color{#d91a1a}-1.05\%$
test_keys_nested_leaf 85.5840μs 48.1921μs 20.7503 KOps/s 21.0377 KOps/s $\color{#d91a1a}-1.37\%$
test_keys_stack_nested 82.0140μs 56.6082μs 17.6653 KOps/s 17.7423 KOps/s $\color{#d91a1a}-0.43\%$
test_keys_stack_nested_leaf 0.2360ms 48.6190μs 20.5681 KOps/s 21.0366 KOps/s $\color{#d91a1a}-2.23\%$
test_keys_stack_nested_locked 0.2334ms 61.5710μs 16.2414 KOps/s 16.2043 KOps/s $\color{#35bf28}+0.23\%$
test_values 27.7632μs 0.8407μs 1.1894 MOps/s 1.1999 MOps/s $\color{#d91a1a}-0.87\%$
test_values_nested 92.1050μs 40.9474μs 24.4216 KOps/s 24.4464 KOps/s $\color{#d91a1a}-0.10\%$
test_values_nested_locked 67.4530μs 42.8449μs 23.3400 KOps/s 23.2846 KOps/s $\color{#35bf28}+0.24\%$
test_values_nested_leaf 62.2930μs 35.7479μs 27.9737 KOps/s 28.1360 KOps/s $\color{#d91a1a}-0.58\%$
test_values_stack_nested 68.4630μs 41.8265μs 23.9083 KOps/s 24.3669 KOps/s $\color{#d91a1a}-1.88\%$
test_values_stack_nested_leaf 61.3430μs 35.6692μs 28.0354 KOps/s 27.9464 KOps/s $\color{#35bf28}+0.32\%$
test_values_stack_nested_locked 71.9730μs 43.4478μs 23.0161 KOps/s 23.2215 KOps/s $\color{#d91a1a}-0.88\%$
test_membership 1.6596μs 0.5032μs 1.9871 MOps/s 1.9495 MOps/s $\color{#35bf28}+1.93\%$
test_membership_nested 29.5520μs 1.9524μs 512.1795 KOps/s 526.7942 KOps/s $\color{#d91a1a}-2.77\%$
test_membership_nested_leaf 12.5073μs 1.8892μs 529.3118 KOps/s 539.0995 KOps/s $\color{#d91a1a}-1.82\%$
test_membership_stacked_nested 27.7420μs 1.9290μs 518.3924 KOps/s 524.6367 KOps/s $\color{#d91a1a}-1.19\%$
test_membership_stacked_nested_leaf 26.0410μs 1.9055μs 524.7873 KOps/s 524.7709 KOps/s $+0.00\%$
test_membership_nested_last 35.9610μs 2.8205μs 354.5418 KOps/s 359.2920 KOps/s $\color{#d91a1a}-1.32\%$
test_membership_nested_leaf_last 26.6220μs 2.7888μs 358.5785 KOps/s 358.2296 KOps/s $\color{#35bf28}+0.10\%$
test_membership_stacked_nested_last 34.1120μs 7.8721μs 127.0302 KOps/s 359.5462 KOps/s $\textbf{\color{#d91a1a}-64.67\%}$
test_membership_stacked_nested_leaf_last 52.6120μs 7.7611μs 128.8472 KOps/s 360.8750 KOps/s $\textbf{\color{#d91a1a}-64.30\%}$
test_nested_getleaf 30.9320μs 6.0492μs 165.3114 KOps/s 166.7192 KOps/s $\color{#d91a1a}-0.84\%$
test_nested_get 29.4620μs 5.7588μs 173.6471 KOps/s 172.9440 KOps/s $\color{#35bf28}+0.41\%$
test_stacked_getleaf 28.6220μs 6.0562μs 165.1195 KOps/s 165.4131 KOps/s $\color{#d91a1a}-0.18\%$
test_stacked_get 33.5610μs 5.6539μs 176.8682 KOps/s 175.9810 KOps/s $\color{#35bf28}+0.50\%$
test_nested_getitemleaf 28.6020μs 6.1325μs 163.0655 KOps/s 161.7126 KOps/s $\color{#35bf28}+0.84\%$
test_nested_getitem 0.1730ms 5.7476μs 173.9853 KOps/s 172.2431 KOps/s $\color{#35bf28}+1.01\%$
test_stacked_getitemleaf 44.2030μs 6.1176μs 163.4641 KOps/s 163.0141 KOps/s $\color{#35bf28}+0.28\%$
test_stacked_getitem 47.3630μs 5.6836μs 175.9439 KOps/s 172.7140 KOps/s $\color{#35bf28}+1.87\%$
test_lock_nested 4.5465ms 0.4140ms 2.4155 KOps/s 2.3954 KOps/s $\color{#35bf28}+0.84\%$
test_lock_stack_nested 0.4688ms 0.3690ms 2.7103 KOps/s 2.6272 KOps/s $\color{#35bf28}+3.17\%$
test_unlock_nested 0.7378ms 0.3494ms 2.8621 KOps/s 2.8214 KOps/s $\color{#35bf28}+1.44\%$
test_unlock_stack_nested 0.4409ms 0.3085ms 3.2416 KOps/s 3.1134 KOps/s $\color{#35bf28}+4.12\%$
test_flatten_speed 0.1455ms 69.2485μs 14.4407 KOps/s 14.4915 KOps/s $\color{#d91a1a}-0.35\%$
test_unflatten_speed 0.4074ms 0.2857ms 3.5004 KOps/s 3.4927 KOps/s $\color{#35bf28}+0.22\%$
test_common_ops 1.4671ms 1.2374ms 808.1774 Ops/s 836.5809 Ops/s $\color{#d91a1a}-3.40\%$
test_creation 30.1810μs 1.5071μs 663.5194 KOps/s 661.8972 KOps/s $\color{#35bf28}+0.25\%$
test_creation_empty 42.2120μs 16.1589μs 61.8855 KOps/s 66.4756 KOps/s $\textbf{\color{#d91a1a}-6.90\%}$
test_creation_nested_1 50.6520μs 18.2118μs 54.9095 KOps/s 59.4158 KOps/s $\textbf{\color{#d91a1a}-7.58\%}$
test_creation_nested_2 51.1930μs 20.9380μs 47.7600 KOps/s 51.8789 KOps/s $\textbf{\color{#d91a1a}-7.94\%}$
test_clone 0.1798ms 28.4857μs 35.1053 KOps/s 36.2477 KOps/s $\color{#d91a1a}-3.15\%$
test_getitem[int] 1.4815ms 16.7492μs 59.7044 KOps/s 63.4607 KOps/s $\textbf{\color{#d91a1a}-5.92\%}$
test_getitem[slice_int] 0.2206ms 29.3005μs 34.1291 KOps/s 36.7128 KOps/s $\textbf{\color{#d91a1a}-7.04\%}$
test_getitem[range] 0.2300ms 0.1084ms 9.2231 KOps/s 9.5048 KOps/s $\color{#d91a1a}-2.96\%$
test_getitem[tuple] 0.1710ms 24.5856μs 40.6742 KOps/s 41.9776 KOps/s $\color{#d91a1a}-3.10\%$
test_getitem[list] 0.3015ms 0.1032ms 9.6944 KOps/s 10.5696 KOps/s $\textbf{\color{#d91a1a}-8.28\%}$
test_setitem_dim[int] 69.5630μs 45.9954μs 21.7413 KOps/s 23.3892 KOps/s $\textbf{\color{#d91a1a}-7.05\%}$
test_setitem_dim[slice_int] 90.1740μs 64.1770μs 15.5819 KOps/s 15.5852 KOps/s $\color{#d91a1a}-0.02\%$
test_setitem_dim[range] 0.2929ms 0.1232ms 8.1158 KOps/s 8.1173 KOps/s $\color{#d91a1a}-0.02\%$
test_setitem_dim[tuple] 0.1855ms 58.6335μs 17.0551 KOps/s 17.1207 KOps/s $\color{#d91a1a}-0.38\%$
test_setitem 0.1947ms 41.2720μs 24.2295 KOps/s 25.1543 KOps/s $\color{#d91a1a}-3.68\%$
test_set 0.2417ms 40.3725μs 24.7693 KOps/s 26.0906 KOps/s $\textbf{\color{#d91a1a}-5.06\%}$
test_set_shared 0.3736ms 50.7034μs 19.7226 KOps/s 20.2806 KOps/s $\color{#d91a1a}-2.75\%$
test_update 0.2013ms 50.8674μs 19.6589 KOps/s 21.5626 KOps/s $\textbf{\color{#d91a1a}-8.83\%}$
test_update_nested 0.2109ms 57.6847μs 17.3356 KOps/s 18.6423 KOps/s $\textbf{\color{#d91a1a}-7.01\%}$
test_update__nested 0.2064ms 57.7681μs 17.3106 KOps/s 17.7235 KOps/s $\color{#d91a1a}-2.33\%$
test_set_nested 0.1851ms 42.9286μs 23.2945 KOps/s 24.5686 KOps/s $\textbf{\color{#d91a1a}-5.19\%}$
test_set_nested_new 0.2280ms 48.1391μs 20.7732 KOps/s 21.5379 KOps/s $\color{#d91a1a}-3.55\%$
test_select 0.2158ms 63.2527μs 15.8096 KOps/s 16.3620 KOps/s $\color{#d91a1a}-3.38\%$
test_select_nested 0.1496ms 42.8957μs 23.3124 KOps/s 23.7041 KOps/s $\color{#d91a1a}-1.65\%$
test_exclude_nested 0.2138ms 57.8137μs 17.2969 KOps/s 16.7747 KOps/s $\color{#35bf28}+3.11\%$
test_empty[True] 0.8496ms 0.2488ms 4.0194 KOps/s 4.0880 KOps/s $\color{#d91a1a}-1.68\%$
test_empty[False] 3.2741μs 0.7541μs 1.3261 MOps/s 1.3544 MOps/s $\color{#d91a1a}-2.09\%$
test_to 52.0420μs 25.6882μs 38.9284 KOps/s 38.5638 KOps/s $\color{#35bf28}+0.95\%$
test_to_nonblocking 49.7120μs 24.7911μs 40.3371 KOps/s 41.5891 KOps/s $\color{#d91a1a}-3.01\%$
test_unbind_speed 1.4540ms 0.2792ms 3.5815 KOps/s 3.6494 KOps/s $\color{#d91a1a}-1.86\%$
test_unbind_speed_stack0 0.3779ms 0.2699ms 3.7051 KOps/s 3.6407 KOps/s $\color{#35bf28}+1.77\%$
test_unbind_speed_stack1 94.0235ms 0.6943ms 1.4402 KOps/s 1.4005 KOps/s $\color{#35bf28}+2.83\%$
test_split 97.0063ms 2.1547ms 464.1080 Ops/s 463.3381 Ops/s $\color{#35bf28}+0.17\%$
test_chunk 97.3370ms 2.1490ms 465.3406 Ops/s 461.0041 Ops/s $\color{#35bf28}+0.94\%$
test_creation[device0] 0.3446ms 0.1248ms 8.0119 KOps/s 8.0007 KOps/s $\color{#35bf28}+0.14\%$
test_creation_from_tensor 0.3706ms 0.1281ms 7.8091 KOps/s 7.8371 KOps/s $\color{#d91a1a}-0.36\%$
test_add_one[memmap_tensor0] 0.2211ms 8.4503μs 118.3393 KOps/s 118.7547 KOps/s $\color{#d91a1a}-0.35\%$
test_contiguous[memmap_tensor0] 42.9530μs 2.2032μs 453.8949 KOps/s 449.5567 KOps/s $\color{#35bf28}+0.96\%$
test_stack[memmap_tensor0] 0.1866ms 6.6943μs 149.3802 KOps/s 148.2036 KOps/s $\color{#35bf28}+0.79\%$
test_memmaptd_index 1.0636ms 0.4188ms 2.3880 KOps/s 2.3630 KOps/s $\color{#35bf28}+1.06\%$
test_memmaptd_index_astensor 0.9723ms 0.4693ms 2.1308 KOps/s 2.0920 KOps/s $\color{#35bf28}+1.85\%$
test_memmaptd_index_op 1.4062ms 0.9979ms 1.0021 KOps/s 1.0094 KOps/s $\color{#d91a1a}-0.72\%$
test_serialize_model 0.1304s 0.1293s 7.7363 Ops/s 7.7227 Ops/s $\color{#35bf28}+0.18\%$
test_serialize_model_pickle 1.5295s 1.2129s 0.8245 Ops/s 0.8242 Ops/s $\color{#35bf28}+0.03\%$
test_serialize_weights 0.1296s 0.1278s 7.8265 Ops/s 6.9929 Ops/s $\textbf{\color{#35bf28}+11.92\%}$
test_serialize_weights_returnearly 0.2409s 62.9966ms 15.8739 Ops/s 18.0906 Ops/s $\textbf{\color{#d91a1a}-12.25\%}$
test_serialize_weights_pickle 1.3456s 1.2183s 0.8208 Ops/s 0.8184 Ops/s $\color{#35bf28}+0.30\%$
test_reshape_pytree 0.1765ms 37.0236μs 27.0098 KOps/s 27.4638 KOps/s $\color{#d91a1a}-1.65\%$
test_reshape_td 0.1490ms 43.1846μs 23.1564 KOps/s 23.3131 KOps/s $\color{#d91a1a}-0.67\%$
test_view_pytree 0.1828ms 35.7037μs 28.0083 KOps/s 27.8945 KOps/s $\color{#35bf28}+0.41\%$
test_view_td 0.1712ms 47.8991μs 20.8772 KOps/s 21.0855 KOps/s $\color{#d91a1a}-0.99\%$
test_unbind_pytree 0.1680ms 33.7975μs 29.5880 KOps/s 29.3172 KOps/s $\color{#35bf28}+0.92\%$
test_unbind_td 0.3779ms 42.1918μs 23.7013 KOps/s 21.8954 KOps/s $\textbf{\color{#35bf28}+8.25\%}$
test_split_pytree 0.1824ms 46.0047μs 21.7369 KOps/s 20.8308 KOps/s $\color{#35bf28}+4.35\%$
test_split_td 0.6868ms 55.2009μs 18.1156 KOps/s 14.6964 KOps/s $\textbf{\color{#35bf28}+23.27\%}$
test_add_pytree 0.2041ms 55.1244μs 18.1408 KOps/s 17.4836 KOps/s $\color{#35bf28}+3.76\%$
test_add_td 0.2273ms 88.9934μs 11.2368 KOps/s 10.5548 KOps/s $\textbf{\color{#35bf28}+6.46\%}$
test_compile_add_one_nested[tensordict-compile] 0.4096ms 0.2134ms 4.6850 KOps/s 4.8335 KOps/s $\color{#d91a1a}-3.07\%$
test_compile_add_one_nested[tensordict-eager] 0.2999ms 0.1476ms 6.7739 KOps/s 6.6984 KOps/s $\color{#35bf28}+1.13\%$
test_compile_add_one_nested[pytree-compile] 0.2804ms 0.1461ms 6.8445 KOps/s 7.0439 KOps/s $\color{#d91a1a}-2.83\%$
test_compile_add_one_nested[pytree-eager] 0.3340ms 0.1797ms 5.5634 KOps/s 5.6935 KOps/s $\color{#d91a1a}-2.29\%$
test_compile_copy_nested[tensordict-compile] 0.1156ms 21.3035μs 46.9407 KOps/s 46.4320 KOps/s $\color{#35bf28}+1.10\%$
test_compile_copy_nested[tensordict-eager] 0.1139ms 43.5710μs 22.9510 KOps/s 22.4686 KOps/s $\color{#35bf28}+2.15\%$
test_compile_copy_nested[pytree-compile] 0.2059ms 64.9119μs 15.4055 KOps/s 15.4465 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_copy_nested[pytree-eager] 75.2730μs 49.5417μs 20.1850 KOps/s 20.0946 KOps/s $\color{#35bf28}+0.45\%$
test_compile_add_one_flat[tensordict-compile] 0.4643ms 0.3163ms 3.1616 KOps/s 3.1295 KOps/s $\color{#35bf28}+1.03\%$
test_compile_add_one_flat[tensordict-eager] 0.3532ms 0.2040ms 4.9028 KOps/s 4.7973 KOps/s $\color{#35bf28}+2.20\%$
test_compile_add_one_flat[tensorclass-compile] 0.2834ms 0.1257ms 7.9554 KOps/s 7.4440 KOps/s $\textbf{\color{#35bf28}+6.87\%}$
test_compile_add_one_flat[tensorclass-eager] 0.2375ms 60.2676μs 16.5927 KOps/s 15.8154 KOps/s $\color{#35bf28}+4.91\%$
test_compile_add_one_flat[pytree-compile] 0.4616ms 0.3166ms 3.1583 KOps/s 3.0610 KOps/s $\color{#35bf28}+3.18\%$
test_compile_add_one_flat[pytree-eager] 0.7707ms 0.6064ms 1.6490 KOps/s 1.6257 KOps/s $\color{#35bf28}+1.44\%$
test_compile_add_self_flat[tensordict-eager] 0.4192ms 0.2439ms 4.1002 KOps/s 4.0017 KOps/s $\color{#35bf28}+2.46\%$
test_compile_add_self_flat[tensordict-compile] 0.4722ms 0.3177ms 3.1477 KOps/s 3.0764 KOps/s $\color{#35bf28}+2.32\%$
test_compile_add_self_flat[tensorclass-eager] 0.2177ms 68.5848μs 14.5805 KOps/s 13.7529 KOps/s $\textbf{\color{#35bf28}+6.02\%}$
test_compile_add_self_flat[tensorclass-compile] 0.2771ms 0.1280ms 7.8144 KOps/s 7.5402 KOps/s $\color{#35bf28}+3.64\%$
test_compile_add_self_flat[pytree-eager] 0.6993ms 0.5224ms 1.9141 KOps/s 1.8303 KOps/s $\color{#35bf28}+4.58\%$
test_compile_add_self_flat[pytree-compile] 0.4443ms 0.3248ms 3.0790 KOps/s 3.0919 KOps/s $\color{#d91a1a}-0.42\%$
test_compile_copy_flat[tensordict-compile] 0.1713ms 17.8473μs 56.0308 KOps/s 55.3724 KOps/s $\color{#35bf28}+1.19\%$
test_compile_copy_flat[tensordict-eager] 77.2540μs 28.0676μs 35.6282 KOps/s 35.4011 KOps/s $\color{#35bf28}+0.64\%$
test_compile_copy_flat[pytree-compile] 0.1462ms 69.4723μs 14.3942 KOps/s 14.3873 KOps/s $\color{#35bf28}+0.05\%$
test_compile_copy_flat[pytree-eager] 81.2940μs 51.0293μs 19.5966 KOps/s 19.3334 KOps/s $\color{#35bf28}+1.36\%$
test_compile_assign_and_add[tensordict-compile] 2.2899ms 0.8087ms 1.2365 KOps/s 1.1420 KOps/s $\textbf{\color{#35bf28}+8.28\%}$
test_compile_assign_and_add[tensordict-eager] 3.3854ms 3.0782ms 324.8630 Ops/s 329.7606 Ops/s $\color{#d91a1a}-1.49\%$
test_compile_assign_and_add[pytree-compile] 2.2842ms 0.7965ms 1.2555 KOps/s 1.1510 KOps/s $\textbf{\color{#35bf28}+9.08\%}$
test_compile_assign_and_add[pytree-eager] 3.3293ms 3.0713ms 325.5923 Ops/s 331.1644 Ops/s $\color{#d91a1a}-1.68\%$
test_compile_indexing[tensor-tensordict-compile] 0.2564ms 0.1071ms 9.3385 KOps/s 9.4545 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_indexing[tensor-tensordict-eager] 0.2076ms 58.8461μs 16.9935 KOps/s 16.4814 KOps/s $\color{#35bf28}+3.11\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2550ms 0.1008ms 9.9202 KOps/s 9.8937 KOps/s $\color{#35bf28}+0.27\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1919ms 41.2492μs 24.2429 KOps/s 24.1387 KOps/s $\color{#35bf28}+0.43\%$
test_compile_indexing[tensor-pytree-compile] 0.2559ms 0.1021ms 9.7975 KOps/s 9.7907 KOps/s $\color{#35bf28}+0.07\%$
test_compile_indexing[tensor-pytree-eager] 0.1928ms 41.2501μs 24.2424 KOps/s 24.2398 KOps/s $\color{#35bf28}+0.01\%$
test_compile_indexing[slice-tensordict-compile] 0.3178ms 0.1372ms 7.2877 KOps/s 7.3846 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_indexing[slice-tensordict-eager] 0.1584ms 24.8924μs 40.1729 KOps/s 38.6626 KOps/s $\color{#35bf28}+3.91\%$
test_compile_indexing[slice-tensorclass-compile] 0.2859ms 0.1339ms 7.4656 KOps/s 7.7550 KOps/s $\color{#d91a1a}-3.73\%$
test_compile_indexing[slice-tensorclass-eager] 0.1149ms 20.3419μs 49.1596 KOps/s 48.5266 KOps/s $\color{#35bf28}+1.30\%$
test_compile_indexing[slice-pytree-compile] 0.2855ms 0.1310ms 7.6364 KOps/s 7.5353 KOps/s $\color{#35bf28}+1.34\%$
test_compile_indexing[slice-pytree-eager] 0.1195ms 20.4230μs 48.9643 KOps/s 48.8796 KOps/s $\color{#35bf28}+0.17\%$
test_compile_indexing[int-tensordict-compile] 0.3170ms 0.1372ms 7.2898 KOps/s 7.0406 KOps/s $\color{#35bf28}+3.54\%$
test_compile_indexing[int-tensordict-eager] 0.4933ms 24.4453μs 40.9076 KOps/s 39.4904 KOps/s $\color{#35bf28}+3.59\%$
test_compile_indexing[int-tensorclass-compile] 0.3313ms 0.1313ms 7.6169 KOps/s 7.4939 KOps/s $\color{#35bf28}+1.64\%$
test_compile_indexing[int-tensorclass-eager] 0.1799ms 23.6920μs 42.2083 KOps/s 48.6508 KOps/s $\textbf{\color{#d91a1a}-13.24\%}$
test_compile_indexing[int-pytree-compile] 0.2824ms 0.1310ms 7.6345 KOps/s 7.7097 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_indexing[int-pytree-eager] 56.2230μs 20.5397μs 48.6861 KOps/s 48.7587 KOps/s $\color{#d91a1a}-0.15\%$
test_mod_add[eager] 0.1832ms 31.2754μs 31.9740 KOps/s 32.7376 KOps/s $\color{#d91a1a}-2.33\%$
test_mod_add[compile] 0.2194ms 68.9088μs 14.5119 KOps/s 14.2187 KOps/s $\color{#35bf28}+2.06\%$
test_mod_add[compile-overhead] 0.2575ms 0.1336ms 7.4827 KOps/s 6.6201 KOps/s $\textbf{\color{#35bf28}+13.03\%}$
test_mod_wrap[eager] 0.3832ms 0.2298ms 4.3515 KOps/s 4.2702 KOps/s $\color{#35bf28}+1.90\%$
test_mod_wrap[compile] 1.5016ms 0.3006ms 3.3262 KOps/s 3.4022 KOps/s $\color{#d91a1a}-2.24\%$
test_mod_wrap[compile-overhead] 7.5702ms 4.0276ms 248.2868 Ops/s 249.3961 Ops/s $\color{#d91a1a}-0.44\%$
test_mod_wrap_and_backward[eager] 1.4561ms 1.2854ms 777.9910 Ops/s 740.6176 Ops/s $\textbf{\color{#35bf28}+5.05\%}$
test_mod_wrap_and_backward[compile] 1.5465ms 1.2886ms 776.0194 Ops/s 775.1374 Ops/s $\color{#35bf28}+0.11\%$
test_mod_wrap_and_backward[compile-overhead] 1.3311ms 0.8913ms 1.1219 KOps/s 1.1175 KOps/s $\color{#35bf28}+0.40\%$
test_seq_add[eager] 0.2849ms 0.1012ms 9.8769 KOps/s 10.6115 KOps/s $\textbf{\color{#d91a1a}-6.92\%}$
test_seq_add[compile] 0.2510ms 84.7810μs 11.7951 KOps/s 12.3600 KOps/s $\color{#d91a1a}-4.57\%$
test_seq_add[compile-overhead] 0.2773ms 0.1161ms 8.6145 KOps/s 8.9524 KOps/s $\color{#d91a1a}-3.78\%$
test_seq_wrap[eager] 0.5582ms 0.3923ms 2.5488 KOps/s 2.7163 KOps/s $\textbf{\color{#d91a1a}-6.16\%}$
test_seq_wrap[compile] 0.5085ms 0.3274ms 3.0541 KOps/s 3.1935 KOps/s $\color{#d91a1a}-4.36\%$
test_seq_wrap[compile-overhead] 0.3869ms 0.2278ms 4.3898 KOps/s 4.5184 KOps/s $\color{#d91a1a}-2.85\%$
test_func_call_runtime[False-eager] 0.9430ms 0.7563ms 1.3222 KOps/s 1.3432 KOps/s $\color{#d91a1a}-1.56\%$
test_func_call_runtime[False-compile] 1.0217ms 0.7683ms 1.3016 KOps/s 1.2872 KOps/s $\color{#35bf28}+1.12\%$
test_func_call_runtime[False-compile-overhead] 0.4807ms 0.3559ms 2.8094 KOps/s 2.7809 KOps/s $\color{#35bf28}+1.03\%$
test_func_call_runtime[True-eager] 1.1055ms 0.8735ms 1.1448 KOps/s 1.1552 KOps/s $\color{#d91a1a}-0.90\%$
test_func_call_runtime[True-compile] 1.0142ms 0.8225ms 1.2158 KOps/s 1.2365 KOps/s $\color{#d91a1a}-1.67\%$
test_func_call_runtime[True-compile-overhead] 0.5132ms 0.3907ms 2.5597 KOps/s 2.5519 KOps/s $\color{#35bf28}+0.31\%$
test_func_call_cm_runtime[False-eager] 0.8695ms 0.7061ms 1.4163 KOps/s 1.4149 KOps/s $\color{#35bf28}+0.10\%$
test_func_call_cm_runtime[False-compile] 0.9402ms 0.7735ms 1.2928 KOps/s 1.2287 KOps/s $\textbf{\color{#35bf28}+5.22\%}$
test_func_call_cm_runtime[False-compile-overhead] 0.4842ms 0.3586ms 2.7888 KOps/s 2.7747 KOps/s $\color{#35bf28}+0.51\%$
test_func_call_cm_runtime[True-eager] 1.1305ms 0.9589ms 1.0428 KOps/s 1.0382 KOps/s $\color{#35bf28}+0.44\%$
test_func_call_cm_runtime[True-compile] 1.0475ms 0.8569ms 1.1670 KOps/s 1.1847 KOps/s $\color{#d91a1a}-1.50\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5635ms 0.4165ms 2.4008 KOps/s 2.3524 KOps/s $\color{#35bf28}+2.06\%$
test_vmap_func_call_cm_runtime[eager] 2.5252ms 1.9863ms 503.4552 Ops/s 496.6441 Ops/s $\color{#35bf28}+1.37\%$
test_vmap_func_call_cm_runtime[compile] 1.0081ms 0.8437ms 1.1852 KOps/s 1.1598 KOps/s $\color{#35bf28}+2.20\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5451ms 0.4226ms 2.3661 KOps/s 2.3485 KOps/s $\color{#35bf28}+0.75\%$
test_distributed 3.3649ms 0.3370ms 2.9677 KOps/s 8.3549 KOps/s $\textbf{\color{#d91a1a}-64.48\%}$
test_tdmodule 40.7720μs 15.3787μs 65.0250 KOps/s 64.5712 KOps/s $\color{#35bf28}+0.70\%$
test_tdmodule_dispatch 69.2330μs 30.4011μs 32.8935 KOps/s 33.9440 KOps/s $\color{#d91a1a}-3.09\%$
test_tdseq 34.6720μs 16.5204μs 60.5313 KOps/s 65.4617 KOps/s $\textbf{\color{#d91a1a}-7.53\%}$
test_tdseq_dispatch 53.8330μs 32.2493μs 31.0084 KOps/s 32.7590 KOps/s $\textbf{\color{#d91a1a}-5.34\%}$
test_instantiation_functorch 1.9966ms 1.8365ms 544.5024 Ops/s 540.9122 Ops/s $\color{#35bf28}+0.66\%$
test_instantiation_td 1.8157ms 1.1829ms 845.3810 Ops/s 837.6897 Ops/s $\color{#35bf28}+0.92\%$
test_exec_functorch 0.3317ms 0.2056ms 4.8630 KOps/s 4.8998 KOps/s $\color{#d91a1a}-0.75\%$
test_exec_functional_call 0.3733ms 0.2021ms 4.9489 KOps/s 4.8124 KOps/s $\color{#35bf28}+2.84\%$
test_exec_td 0.3569ms 0.2074ms 4.8211 KOps/s 4.5611 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_exec_td_decorator 0.9669ms 0.2594ms 3.8557 KOps/s 3.7657 KOps/s $\color{#35bf28}+2.39\%$
test_vmap_mlp_speed[True-True] 0.8668ms 0.6776ms 1.4759 KOps/s 1.4669 KOps/s $\color{#35bf28}+0.61\%$
test_vmap_mlp_speed[True-False] 0.8089ms 0.6627ms 1.5090 KOps/s 1.4781 KOps/s $\color{#35bf28}+2.09\%$
test_vmap_mlp_speed[False-True] 0.7453ms 0.5671ms 1.7635 KOps/s 1.7735 KOps/s $\color{#d91a1a}-0.56\%$
test_vmap_mlp_speed[False-False] 0.7521ms 0.5655ms 1.7684 KOps/s 1.6961 KOps/s $\color{#35bf28}+4.26\%$
test_vmap_mlp_speed_decorator[True-True] 1.2289ms 0.6507ms 1.5369 KOps/s 1.4623 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_vmap_mlp_speed_decorator[True-False] 0.7921ms 0.6495ms 1.5396 KOps/s 1.5244 KOps/s $\color{#35bf28}+1.00\%$
test_vmap_mlp_speed_decorator[False-True] 0.7635ms 0.5804ms 1.7228 KOps/s 1.6694 KOps/s $\color{#35bf28}+3.20\%$
test_vmap_mlp_speed_decorator[False-False] 0.7896ms 0.5904ms 1.6937 KOps/s 1.6517 KOps/s $\color{#35bf28}+2.55\%$
test_vmap_transformer_speed[True-True] 8.3849ms 8.0621ms 124.0365 Ops/s 123.0466 Ops/s $\color{#35bf28}+0.80\%$
test_vmap_transformer_speed[True-False] 8.1922ms 8.0295ms 124.5402 Ops/s 123.4041 Ops/s $\color{#35bf28}+0.92\%$
test_vmap_transformer_speed[False-True] 7.9850ms 7.8511ms 127.3707 Ops/s 126.6256 Ops/s $\color{#35bf28}+0.59\%$
test_vmap_transformer_speed[False-False] 8.1848ms 7.8667ms 127.1179 Ops/s 124.4428 Ops/s $\color{#35bf28}+2.15\%$
test_vmap_transformer_speed_decorator[True-True] 19.5401ms 18.7929ms 53.2115 Ops/s 52.2266 Ops/s $\color{#35bf28}+1.89\%$
test_vmap_transformer_speed_decorator[True-False] 19.5839ms 18.8969ms 52.9187 Ops/s 51.7919 Ops/s $\color{#35bf28}+2.18\%$
test_vmap_transformer_speed_decorator[False-True] 19.3944ms 18.7137ms 53.4368 Ops/s 51.8872 Ops/s $\color{#35bf28}+2.99\%$
test_vmap_transformer_speed_decorator[False-False] 19.5702ms 18.7524ms 53.3266 Ops/s 53.4875 Ops/s $\color{#d91a1a}-0.30\%$
test_to_module_speed[True] 1.4198ms 0.9416ms 1.0621 KOps/s 1.0489 KOps/s $\color{#35bf28}+1.26\%$
test_to_module_speed[False] 1.2987ms 0.9186ms 1.0886 KOps/s 1.0811 KOps/s $\color{#35bf28}+0.70\%$
test_tc_init 62.1530μs 34.9086μs 28.6462 KOps/s 30.7669 KOps/s $\textbf{\color{#d91a1a}-6.89\%}$
test_tc_init_nested 0.1432ms 68.7514μs 14.5452 KOps/s 14.6878 KOps/s $\color{#d91a1a}-0.97\%$
test_tc_first_layer_tensor 4.3230μs 0.6787μs 1.4735 MOps/s 1.4654 MOps/s $\color{#35bf28}+0.55\%$
test_tc_first_layer_nontensor 24.2110μs 2.2523μs 443.9959 KOps/s 446.4475 KOps/s $\color{#d91a1a}-0.55\%$
test_tc_second_layer_tensor 29.4140μs 1.3594μs 735.6156 KOps/s 715.8035 KOps/s $\color{#35bf28}+2.77\%$
test_tc_second_layer_nontensor 27.2810μs 2.9549μs 338.4194 KOps/s 338.3090 KOps/s $\color{#35bf28}+0.03\%$
test_unbind 0.1965s 12.2761ms 81.4588 Ops/s 92.2363 Ops/s $\textbf{\color{#d91a1a}-11.68\%}$
test_full_like 0.8052ms 0.5762ms 1.7355 KOps/s 1.7338 KOps/s $\color{#35bf28}+0.10\%$
test_zeros_like 0.3354ms 0.1982ms 5.0444 KOps/s 5.0515 KOps/s $\color{#d91a1a}-0.14\%$
test_ones_like 0.3691ms 0.1981ms 5.0484 KOps/s 5.0500 KOps/s $\color{#d91a1a}-0.03\%$
test_clone 0.5998ms 0.4152ms 2.4085 KOps/s 2.4103 KOps/s $\color{#d91a1a}-0.07\%$
test_squeeze 0.1096ms 9.8156μs 101.8784 KOps/s 101.0757 KOps/s $\color{#35bf28}+0.79\%$
test_unsqueeze 0.2236ms 75.0070μs 13.3321 KOps/s 13.2795 KOps/s $\color{#35bf28}+0.40\%$
test_split 0.4165ms 0.1588ms 6.2989 KOps/s 6.2687 KOps/s $\color{#35bf28}+0.48\%$
test_permute 0.2906ms 0.1788ms 5.5913 KOps/s 5.6165 KOps/s $\color{#d91a1a}-0.45\%$
test_stack 1.4035ms 0.8673ms 1.1530 KOps/s 1.1389 KOps/s $\color{#35bf28}+1.24\%$
test_cat 1.3950ms 1.2324ms 811.4574 Ops/s 811.8503 Ops/s $\color{#d91a1a}-0.05\%$
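
For readers checking the table: the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces two rows under that assumption. The function names and the 5% cutoff for bold entries are illustrative guesses inferred from the rows above, not taken from the benchmark tooling.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def is_highlighted(change_pct: float, threshold_pct: float = 5.0) -> bool:
    """Assumed rule: rows are shown in bold when the change exceeds 5% either way."""
    return abs(change_pct) > threshold_pct


# Values copied from the table (both columns in KOps/s).
rows = {
    "test_split_td": (18.1156, 14.6964),
    "test_distributed": (2.9677, 8.3549),
    "test_to": (38.9284, 38.5638),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    marker = "**" if is_highlighted(change) else ""
    print(f"{name}: {marker}{change:+.2f}%{marker}")
```

Both columns must be in the same unit (Ops/s, KOps/s, or MOps/s) before comparing.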
