[BugFix] Faster and safer non-tensor stack by vmoens · Pull Request #1232 · pytorch/tensordict · GitHub

[BugFix] Faster and safer non-tensor stack #1232


Merged 1 commit on Feb 24, 2025

@vmoens (Collaborator) commented Feb 24, 2025

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Feb 24, 2025
ghstack-source-id: f6c61cd
Pull Request resolved: #1232
@facebook-github-bot added the CLA Signed label Feb 24, 2025
@vmoens merged commit 282aae9 into gh/vmoens/48/base Feb 24, 2025
23 of 35 checks passed
@vmoens deleted the gh/vmoens/48/head branch February 24, 2025 14:49

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 43.8330μs 20.5702μs 48.6139 KOps/s 47.9152 KOps/s $\color{#35bf28}+1.46\%$
test_plain_set_stack_nested 50.2340μs 20.8808μs 47.8908 KOps/s 47.6908 KOps/s $\color{#35bf28}+0.42\%$
test_plain_set_nested_inplace 58.1190μs 22.8923μs 43.6829 KOps/s 43.3637 KOps/s $\color{#35bf28}+0.74\%$
test_plain_set_stack_nested_inplace 49.9930μs 22.8930μs 43.6814 KOps/s 43.7785 KOps/s $\color{#d91a1a}-0.22\%$
test_items 24.1560μs 4.1983μs 238.1937 KOps/s 240.4976 KOps/s $\color{#d91a1a}-0.96\%$
test_items_nested 0.6107ms 0.4112ms 2.4320 KOps/s 2.4507 KOps/s $\color{#d91a1a}-0.76\%$
test_items_nested_locked 0.8519ms 0.4101ms 2.4384 KOps/s 2.4449 KOps/s $\color{#d91a1a}-0.27\%$
test_items_nested_leaf 0.1327ms 77.2905μs 12.9382 KOps/s 13.0288 KOps/s $\color{#d91a1a}-0.70\%$
test_items_stack_nested 0.5859ms 0.4125ms 2.4244 KOps/s 2.4066 KOps/s $\color{#35bf28}+0.74\%$
test_items_stack_nested_leaf 0.1497ms 77.6208μs 12.8831 KOps/s 12.8733 KOps/s $\color{#35bf28}+0.08\%$
test_items_stack_nested_locked 0.6894ms 0.4104ms 2.4368 KOps/s 2.4494 KOps/s $\color{#d91a1a}-0.52\%$
test_keys 38.7630μs 3.4557μs 289.3746 KOps/s 289.6163 KOps/s $\color{#d91a1a}-0.08\%$
test_keys_nested 0.2692ms 0.1666ms 6.0025 KOps/s 6.0645 KOps/s $\color{#d91a1a}-1.02\%$
test_keys_nested_locked 1.8315ms 0.1748ms 5.7210 KOps/s 5.8503 KOps/s $\color{#d91a1a}-2.21\%$
test_keys_nested_leaf 0.2471ms 0.1462ms 6.8392 KOps/s 6.9287 KOps/s $\color{#d91a1a}-1.29\%$
test_keys_stack_nested 0.2707ms 0.1668ms 5.9962 KOps/s 5.9786 KOps/s $\color{#35bf28}+0.29\%$
test_keys_stack_nested_leaf 0.2605ms 0.1462ms 6.8415 KOps/s 6.9416 KOps/s $\color{#d91a1a}-1.44\%$
test_keys_stack_nested_locked 0.2741ms 0.1725ms 5.7983 KOps/s 5.8178 KOps/s $\color{#d91a1a}-0.34\%$
test_values 5.1976μs 1.0562μs 946.7937 KOps/s 957.7968 KOps/s $\color{#d91a1a}-1.15\%$
test_values_nested 0.1129ms 63.2184μs 15.8182 KOps/s 15.9557 KOps/s $\color{#d91a1a}-0.86\%$
test_values_nested_locked 0.1521ms 63.1989μs 15.8231 KOps/s 15.8933 KOps/s $\color{#d91a1a}-0.44\%$
test_values_nested_leaf 0.1232ms 72.0065μs 13.8876 KOps/s 13.8911 KOps/s $\color{#d91a1a}-0.03\%$
test_values_stack_nested 0.1311ms 63.7311μs 15.6909 KOps/s 15.8585 KOps/s $\color{#d91a1a}-1.06\%$
test_values_stack_nested_leaf 0.1281ms 72.2092μs 13.8486 KOps/s 13.5612 KOps/s $\color{#35bf28}+2.12\%$
test_values_stack_nested_locked 0.1172ms 63.3473μs 15.7860 KOps/s 16.0324 KOps/s $\color{#d91a1a}-1.54\%$
test_membership 16.4800μs 0.8665μs 1.1541 MOps/s 1.1591 MOps/s $\color{#d91a1a}-0.43\%$
test_membership_nested 31.6290μs 2.8685μs 348.6094 KOps/s 351.1008 KOps/s $\color{#d91a1a}-0.71\%$
test_membership_nested_leaf 23.3630μs 2.9100μs 343.6367 KOps/s 346.6222 KOps/s $\color{#d91a1a}-0.86\%$
test_membership_stacked_nested 16.4710μs 2.8894μs 346.0904 KOps/s 349.5656 KOps/s $\color{#d91a1a}-0.99\%$
test_membership_stacked_nested_leaf 48.6010μs 2.8539μs 350.4014 KOps/s 349.7582 KOps/s $\color{#35bf28}+0.18\%$
test_membership_nested_last 31.0480μs 4.3617μs 229.2698 KOps/s 227.3615 KOps/s $\color{#35bf28}+0.84\%$
test_membership_nested_leaf_last 27.1910μs 4.3913μs 227.7230 KOps/s 231.9385 KOps/s $\color{#d91a1a}-1.82\%$
test_membership_stacked_nested_last 28.5440μs 4.4368μs 225.3888 KOps/s 229.1118 KOps/s $\color{#d91a1a}-1.62\%$
test_membership_stacked_nested_leaf_last 27.1410μs 4.3378μs 230.5308 KOps/s 231.3882 KOps/s $\color{#d91a1a}-0.37\%$
test_nested_getleaf 54.0190μs 10.6318μs 94.0575 KOps/s 92.9330 KOps/s $\color{#35bf28}+1.21\%$
test_nested_get 46.0570μs 10.0639μs 99.3655 KOps/s 97.3968 KOps/s $\color{#35bf28}+2.02\%$
test_stacked_getleaf 41.9890μs 10.6077μs 94.2711 KOps/s 93.4146 KOps/s $\color{#35bf28}+0.92\%$
test_stacked_get 38.7430μs 10.0135μs 99.8654 KOps/s 97.4105 KOps/s $\color{#35bf28}+2.52\%$
test_nested_getitemleaf 34.3440μs 11.4377μs 87.4301 KOps/s 87.9445 KOps/s $\color{#d91a1a}-0.58\%$
test_nested_getitem 41.1170μs 10.7690μs 92.8587 KOps/s 92.3582 KOps/s $\color{#35bf28}+0.54\%$
test_stacked_getitemleaf 41.7380μs 11.2172μs 89.1487 KOps/s 89.3054 KOps/s $\color{#d91a1a}-0.18\%$
test_stacked_getitem 43.6320μs 10.7809μs 92.7562 KOps/s 93.8362 KOps/s $\color{#d91a1a}-1.15\%$
test_lock_nested 0.7466ms 0.4232ms 2.3627 KOps/s 2.4583 KOps/s $\color{#d91a1a}-3.89\%$
test_lock_stack_nested 0.6742ms 0.4328ms 2.3105 KOps/s 2.3568 KOps/s $\color{#d91a1a}-1.97\%$
test_unlock_nested 0.5023ms 0.3495ms 2.8613 KOps/s 3.0199 KOps/s $\textbf{\color{#d91a1a}-5.25\%}$
test_unlock_stack_nested 0.6365ms 0.3519ms 2.8420 KOps/s 2.9324 KOps/s $\color{#d91a1a}-3.08\%$
test_flatten_speed 0.1770ms 0.1011ms 9.8867 KOps/s 9.9926 KOps/s $\color{#d91a1a}-1.06\%$
test_unflatten_speed 0.7175ms 0.5284ms 1.8924 KOps/s 1.9013 KOps/s $\color{#d91a1a}-0.47\%$
test_common_ops 1.3362ms 0.8154ms 1.2264 KOps/s 1.2075 KOps/s $\color{#35bf28}+1.57\%$
test_creation 29.7860μs 2.4584μs 406.7655 KOps/s 403.5105 KOps/s $\color{#35bf28}+0.81\%$
test_creation_empty 44.4730μs 12.1964μs 81.9914 KOps/s 82.7809 KOps/s $\color{#d91a1a}-0.95\%$
test_creation_nested_1 53.6710μs 15.2923μs 65.3922 KOps/s 67.1445 KOps/s $\color{#d91a1a}-2.61\%$
test_creation_nested_2 46.5080μs 19.9933μs 50.0169 KOps/s 51.4949 KOps/s $\color{#d91a1a}-2.87\%$
test_clone 54.6120μs 13.7608μs 72.6704 KOps/s 72.3065 KOps/s $\color{#35bf28}+0.50\%$
test_getitem[int] 0.7490ms 12.9701μs 77.1003 KOps/s 78.5518 KOps/s $\color{#d91a1a}-1.85\%$
test_getitem[slice_int] 0.1309ms 25.0781μs 39.8755 KOps/s 40.8652 KOps/s $\color{#d91a1a}-2.42\%$
test_getitem[range] 0.1741ms 49.2660μs 20.2980 KOps/s 19.7576 KOps/s $\color{#35bf28}+2.74\%$
test_getitem[tuple] 0.1242ms 20.4503μs 48.8991 KOps/s 49.1822 KOps/s $\color{#d91a1a}-0.58\%$
test_getitem[list] 0.1679ms 45.5225μs 21.9672 KOps/s 22.1838 KOps/s $\color{#d91a1a}-0.98\%$
test_setitem_dim[int] 60.4230μs 26.3104μs 38.0078 KOps/s 37.8874 KOps/s $\color{#35bf28}+0.32\%$
test_setitem_dim[slice_int] 83.7370μs 52.4830μs 19.0538 KOps/s 19.5120 KOps/s $\color{#d91a1a}-2.35\%$
test_setitem_dim[range] 0.1925ms 77.9214μs 12.8334 KOps/s 12.9758 KOps/s $\color{#d91a1a}-1.10\%$
test_setitem_dim[tuple] 76.3540μs 41.8956μs 23.8688 KOps/s 24.3641 KOps/s $\color{#d91a1a}-2.03\%$
test_setitem 57.1270μs 21.0313μs 47.5481 KOps/s 47.6606 KOps/s $\color{#d91a1a}-0.24\%$
test_set 0.1020ms 20.6976μs 48.3149 KOps/s 48.9220 KOps/s $\color{#d91a1a}-1.24\%$
test_set_shared 3.8428ms 0.1880ms 5.3183 KOps/s 5.3966 KOps/s $\color{#d91a1a}-1.45\%$
test_update 0.1185ms 23.4291μs 42.6820 KOps/s 42.0942 KOps/s $\color{#35bf28}+1.40\%$
test_update_nested 95.1590μs 34.3418μs 29.1190 KOps/s 28.4994 KOps/s $\color{#35bf28}+2.17\%$
test_update__nested 0.4555ms 34.3681μs 29.0968 KOps/s 29.1196 KOps/s $\color{#d91a1a}-0.08\%$
test_set_nested 0.1100ms 22.8921μs 43.6833 KOps/s 44.1554 KOps/s $\color{#d91a1a}-1.07\%$
test_set_nested_new 66.1040μs 27.9520μs 35.7756 KOps/s 36.4505 KOps/s $\color{#d91a1a}-1.85\%$
test_select 0.1004ms 44.4263μs 22.5092 KOps/s 22.8852 KOps/s $\color{#d91a1a}-1.64\%$
test_select_nested 0.1240ms 65.3240μs 15.3083 KOps/s 15.5600 KOps/s $\color{#d91a1a}-1.62\%$
test_exclude_nested 0.1701ms 84.6253μs 11.8168 KOps/s 11.9142 KOps/s $\color{#d91a1a}-0.82\%$
test_empty[True] 0.5549ms 0.4151ms 2.4091 KOps/s 2.4281 KOps/s $\color{#d91a1a}-0.78\%$
test_empty[False] 6.8878μs 1.3733μs 728.1843 KOps/s 715.8156 KOps/s $\color{#35bf28}+1.73\%$
test_unbind_speed 0.3950ms 0.2736ms 3.6551 KOps/s 3.6674 KOps/s $\color{#d91a1a}-0.33\%$
test_unbind_speed_stack0 0.5488ms 0.2718ms 3.6786 KOps/s 3.7670 KOps/s $\color{#d91a1a}-2.35\%$
test_unbind_speed_stack1 0.1123s 0.7411ms 1.3493 KOps/s 1.2389 KOps/s $\textbf{\color{#35bf28}+8.91\%}$
test_split 98.5888ms 1.7952ms 557.0298 Ops/s 625.4581 Ops/s $\textbf{\color{#d91a1a}-10.94\%}$
test_chunk 0.1130s 1.8343ms 545.1592 Ops/s 512.4437 Ops/s $\textbf{\color{#35bf28}+6.38\%}$
test_consolidate_njt[False-None] 9.1186ms 8.3377ms 119.9372 Ops/s 120.0712 Ops/s $\color{#d91a1a}-0.11\%$
test_creation[device0] 0.2763ms 92.8319μs 10.7722 KOps/s 10.6127 KOps/s $\color{#35bf28}+1.50\%$
test_creation_from_tensor 4.0094ms 96.4121μs 10.3721 KOps/s 10.4493 KOps/s $\color{#d91a1a}-0.74\%$
test_add_one[memmap_tensor0] 0.1275ms 5.2010μs 192.2722 KOps/s 194.4549 KOps/s $\color{#d91a1a}-1.12\%$
test_contiguous[memmap_tensor0] 10.7400μs 0.5009μs 1.9965 MOps/s 1.9357 MOps/s $\color{#35bf28}+3.14\%$
test_stack[memmap_tensor0] 27.4720μs 3.4348μs 291.1359 KOps/s 293.3070 KOps/s $\color{#d91a1a}-0.74\%$
test_memmaptd_index 1.2941ms 0.2310ms 4.3286 KOps/s 4.3878 KOps/s $\color{#d91a1a}-1.35\%$
test_memmaptd_index_astensor 0.6679ms 0.3183ms 3.1420 KOps/s 3.1843 KOps/s $\color{#d91a1a}-1.33\%$
test_memmaptd_index_op 0.9609ms 0.6206ms 1.6114 KOps/s 1.6811 KOps/s $\color{#d91a1a}-4.14\%$
test_serialize_model 0.1240s 0.1159s 8.6291 Ops/s 8.6476 Ops/s $\color{#d91a1a}-0.21\%$
test_serialize_model_pickle 0.4685s 0.3906s 2.5605 Ops/s 2.5089 Ops/s $\color{#35bf28}+2.05\%$
test_serialize_weights 0.1254s 0.1156s 8.6505 Ops/s 7.8020 Ops/s $\textbf{\color{#35bf28}+10.88\%}$
test_serialize_weights_returnearly 0.1850s 0.1636s 6.1137 Ops/s 6.2801 Ops/s $\color{#d91a1a}-2.65\%$
test_serialize_weights_pickle 1.1616s 0.7030s 1.4225 Ops/s 2.5201 Ops/s $\textbf{\color{#d91a1a}-43.55\%}$
test_serialize_weights_filesystem 0.1451s 0.1403s 7.1268 Ops/s 7.0511 Ops/s $\color{#35bf28}+1.07\%$
test_serialize_model_filesystem 0.2399s 0.1541s 6.4874 Ops/s 6.4922 Ops/s $\color{#d91a1a}-0.07\%$
test_reshape_pytree 60.7240μs 26.3781μs 37.9102 KOps/s 38.0469 KOps/s $\color{#d91a1a}-0.36\%$
test_reshape_td 88.1750μs 32.7154μs 30.5666 KOps/s 30.1058 KOps/s $\color{#35bf28}+1.53\%$
test_view_pytree 59.9730μs 26.2134μs 38.1484 KOps/s 38.1191 KOps/s $\color{#35bf28}+0.08\%$
test_view_td 98.4550μs 42.1176μs 23.7431 KOps/s 24.1750 KOps/s $\color{#d91a1a}-1.79\%$
test_unbind_pytree 75.9530μs 29.4339μs 33.9745 KOps/s 33.6604 KOps/s $\color{#35bf28}+0.93\%$
test_unbind_td 0.3495ms 40.0170μs 24.9894 KOps/s 25.0197 KOps/s $\color{#d91a1a}-0.12\%$
test_split_pytree 0.1032ms 29.1306μs 34.3282 KOps/s 34.0619 KOps/s $\color{#35bf28}+0.78\%$
test_split_td 0.6434ms 46.2203μs 21.6355 KOps/s 22.0553 KOps/s $\color{#d91a1a}-1.90\%$
test_add_pytree 70.2920μs 36.7763μs 27.1914 KOps/s 27.5686 KOps/s $\color{#d91a1a}-1.37\%$
test_add_td 0.1386ms 61.5443μs 16.2485 KOps/s 17.8282 KOps/s $\textbf{\color{#d91a1a}-8.86\%}$
test_compile_add_one_nested[tensordict-compile] 0.1606ms 66.3681μs 15.0675 KOps/s 15.1469 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_add_one_nested[tensordict-eager] 1.4006ms 0.1760ms 5.6822 KOps/s 5.7711 KOps/s $\color{#d91a1a}-1.54\%$
test_compile_add_one_nested[pytree-compile] 0.1032ms 45.2975μs 22.0763 KOps/s 21.9642 KOps/s $\color{#35bf28}+0.51\%$
test_compile_add_one_nested[pytree-eager] 0.2332ms 0.1203ms 8.3097 KOps/s 8.3399 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_copy_nested[tensordict-compile] 85.3900μs 26.8241μs 37.2799 KOps/s 36.0885 KOps/s $\color{#35bf28}+3.30\%$
test_compile_copy_nested[tensordict-eager] 0.1221ms 59.1333μs 16.9110 KOps/s 17.1573 KOps/s $\color{#d91a1a}-1.44\%$
test_compile_copy_nested[pytree-compile] 0.1680ms 79.3579μs 12.6011 KOps/s 12.6255 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_copy_nested[pytree-eager] 0.1246ms 67.4784μs 14.8196 KOps/s 14.8697 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_add_one_flat[tensordict-compile] 0.2194ms 0.1058ms 9.4526 KOps/s 9.3547 KOps/s $\color{#35bf28}+1.05\%$
test_compile_add_one_flat[tensordict-eager] 0.4989ms 0.2152ms 4.6470 KOps/s 4.5833 KOps/s $\color{#35bf28}+1.39\%$
test_compile_add_one_flat[tensorclass-compile] 0.1290ms 47.8414μs 20.9024 KOps/s 21.6547 KOps/s $\color{#d91a1a}-3.47\%$
test_compile_add_one_flat[tensorclass-eager] 0.1919ms 67.5368μs 14.8067 KOps/s 14.6407 KOps/s $\color{#35bf28}+1.13\%$
test_compile_add_one_flat[pytree-compile] 0.2354ms 99.9211μs 10.0079 KOps/s 9.9814 KOps/s $\color{#35bf28}+0.27\%$
test_compile_add_one_flat[pytree-eager] 0.4720ms 0.2020ms 4.9514 KOps/s 4.9585 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_self_flat[tensordict-eager] 0.3902ms 0.2312ms 4.3253 KOps/s 4.3144 KOps/s $\color{#35bf28}+0.25\%$
test_compile_add_self_flat[tensordict-compile] 0.2367ms 0.1087ms 9.2029 KOps/s 9.3005 KOps/s $\color{#d91a1a}-1.05\%$
test_compile_add_self_flat[tensorclass-eager] 0.3005ms 63.4209μs 15.7677 KOps/s 15.9229 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_add_self_flat[tensorclass-compile] 0.5247ms 49.3746μs 20.2533 KOps/s 20.7969 KOps/s $\color{#d91a1a}-2.61\%$
test_compile_add_self_flat[pytree-eager] 0.2591ms 0.1566ms 6.3852 KOps/s 6.3147 KOps/s $\color{#35bf28}+1.12\%$
test_compile_add_self_flat[pytree-compile] 0.2212ms 0.1017ms 9.8286 KOps/s 9.9383 KOps/s $\color{#d91a1a}-1.10\%$
test_compile_copy_flat[tensordict-compile] 82.2440μs 20.6823μs 48.3505 KOps/s 46.6997 KOps/s $\color{#35bf28}+3.53\%$
test_compile_copy_flat[tensordict-eager] 0.1685ms 67.1705μs 14.8875 KOps/s 14.8538 KOps/s $\color{#35bf28}+0.23\%$
test_compile_copy_flat[pytree-compile] 0.1658ms 80.7140μs 12.3894 KOps/s 12.2672 KOps/s $\color{#35bf28}+1.00\%$
test_compile_copy_flat[pytree-eager] 0.1431ms 66.8965μs 14.9485 KOps/s 14.9086 KOps/s $\color{#35bf28}+0.27\%$
test_compile_assign_and_add[tensordict-compile] 0.5141ms 0.2183ms 4.5813 KOps/s 4.5560 KOps/s $\color{#35bf28}+0.55\%$
test_compile_assign_and_add[tensordict-eager] 1.8633ms 1.3806ms 724.3075 Ops/s 730.5316 Ops/s $\color{#d91a1a}-0.85\%$
test_compile_assign_and_add[pytree-compile] 0.4737ms 0.2127ms 4.7004 KOps/s 4.7065 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_assign_and_add[pytree-eager] 1.4462ms 0.8125ms 1.2308 KOps/s 1.2172 KOps/s $\color{#35bf28}+1.11\%$
test_compile_assign_and_add_stack[compile] 0.5906ms 0.4590ms 2.1787 KOps/s 2.1888 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_assign_and_add_stack[eager] 3.5659ms 2.7483ms 363.8565 Ops/s 365.0800 Ops/s $\color{#d91a1a}-0.34\%$
test_compile_indexing[tensor-tensordict-compile] 84.4680μs 39.4271μs 25.3632 KOps/s 25.7811 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_indexing[tensor-tensordict-eager] 0.8643ms 33.7203μs 29.6557 KOps/s 29.5305 KOps/s $\color{#35bf28}+0.42\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1511ms 31.9558μs 31.2932 KOps/s 32.6025 KOps/s $\color{#d91a1a}-4.02\%$
test_compile_indexing[tensor-tensorclass-eager] 70.1620μs 23.3136μs 42.8933 KOps/s 43.2053 KOps/s $\color{#d91a1a}-0.72\%$
test_compile_indexing[tensor-pytree-compile] 0.1159ms 32.7179μs 30.5643 KOps/s 31.2147 KOps/s $\color{#d91a1a}-2.08\%$
test_compile_indexing[tensor-pytree-eager] 79.2590μs 23.1074μs 43.2762 KOps/s 42.5863 KOps/s $\color{#35bf28}+1.62\%$
test_compile_indexing[slice-tensordict-compile] 0.1351ms 54.0414μs 18.5043 KOps/s 19.1993 KOps/s $\color{#d91a1a}-3.62\%$
test_compile_indexing[slice-tensordict-eager] 0.4215ms 20.2636μs 49.3495 KOps/s 49.0891 KOps/s $\color{#35bf28}+0.53\%$
test_compile_indexing[slice-tensorclass-compile] 98.4540μs 46.1990μs 21.6455 KOps/s 22.0180 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_indexing[slice-tensorclass-eager] 68.7880μs 18.8399μs 53.0788 KOps/s 52.3173 KOps/s $\color{#35bf28}+1.46\%$
test_compile_indexing[slice-pytree-compile] 98.8760μs 46.7732μs 21.3798 KOps/s 21.5376 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_indexing[slice-pytree-eager] 80.0000μs 20.1398μs 49.6529 KOps/s 52.2057 KOps/s $\color{#d91a1a}-4.89\%$
test_compile_indexing[int-tensordict-compile] 0.1203ms 55.4644μs 18.0296 KOps/s 18.5067 KOps/s $\color{#d91a1a}-2.58\%$
test_compile_indexing[int-tensordict-eager] 0.9967ms 20.4440μs 48.9142 KOps/s 49.9302 KOps/s $\color{#d91a1a}-2.03\%$
test_compile_indexing[int-tensorclass-compile] 91.9120μs 46.8485μs 21.3454 KOps/s 21.4303 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_indexing[int-tensorclass-eager] 74.4790μs 18.7925μs 53.2126 KOps/s 53.0054 KOps/s $\color{#35bf28}+0.39\%$
test_compile_indexing[int-pytree-compile] 0.1564ms 47.3064μs 21.1388 KOps/s 21.4985 KOps/s $\color{#d91a1a}-1.67\%$
test_compile_indexing[int-pytree-eager] 87.3240μs 18.7456μs 53.3459 KOps/s 52.8994 KOps/s $\color{#35bf28}+0.84\%$
test_mod_add[eager] 0.1041ms 36.1178μs 27.6872 KOps/s 26.9940 KOps/s $\color{#35bf28}+2.57\%$
test_mod_add[compile] 0.1456ms 66.8901μs 14.9499 KOps/s 15.1505 KOps/s $\color{#d91a1a}-1.32\%$
test_mod_add[compile-overhead] 0.1285ms 64.3673μs 15.5358 KOps/s 14.8705 KOps/s $\color{#35bf28}+4.47\%$
test_mod_wrap[eager] 0.3863ms 0.2304ms 4.3408 KOps/s 4.3209 KOps/s $\color{#35bf28}+0.46\%$
test_mod_wrap[compile] 2.3844ms 0.2333ms 4.2868 KOps/s 4.2893 KOps/s $\color{#d91a1a}-0.06\%$
test_mod_wrap[compile-overhead] 0.3990ms 0.2263ms 4.4196 KOps/s 4.1809 KOps/s $\textbf{\color{#35bf28}+5.71\%}$
test_mod_wrap_and_backward[eager] 13.2430ms 11.4583ms 87.2729 Ops/s 77.4295 Ops/s $\textbf{\color{#35bf28}+12.71\%}$
test_mod_wrap_and_backward[compile] 12.8795ms 11.4385ms 87.4239 Ops/s 88.1442 Ops/s $\color{#d91a1a}-0.82\%$
test_mod_wrap_and_backward[compile-overhead] 12.2105ms 11.0527ms 90.4758 Ops/s 78.4703 Ops/s $\textbf{\color{#35bf28}+15.30\%}$
test_seq_add[eager] 0.2074ms 0.1216ms 8.2228 KOps/s 8.0949 KOps/s $\color{#35bf28}+1.58\%$
test_seq_add[compile] 0.1639ms 76.9513μs 12.9952 KOps/s 12.9020 KOps/s $\color{#35bf28}+0.72\%$
test_seq_add[compile-overhead] 0.1555ms 76.5971μs 13.0553 KOps/s 13.2877 KOps/s $\color{#d91a1a}-1.75\%$
test_seq_wrap[eager] 0.7388ms 0.4629ms 2.1602 KOps/s 2.1782 KOps/s $\color{#d91a1a}-0.83\%$
test_seq_wrap[compile] 0.3775ms 0.2441ms 4.0971 KOps/s 4.0954 KOps/s $\color{#35bf28}+0.04\%$
test_seq_wrap[compile-overhead] 0.3443ms 0.2413ms 4.1444 KOps/s 4.1386 KOps/s $\color{#35bf28}+0.14\%$
test_func_call_runtime[False-eager] 0.7269ms 0.5522ms 1.8108 KOps/s 1.8083 KOps/s $\color{#35bf28}+0.14\%$
test_func_call_runtime[False-compile] 0.7085ms 0.4440ms 2.2521 KOps/s 2.2540 KOps/s $\color{#d91a1a}-0.08\%$
test_func_call_runtime[False-compile-overhead] 0.5461ms 0.4433ms 2.2557 KOps/s 2.2672 KOps/s $\color{#d91a1a}-0.50\%$
test_func_call_runtime[True-eager] 1.2345ms 0.7701ms 1.2986 KOps/s 1.2980 KOps/s $\color{#35bf28}+0.05\%$
test_func_call_runtime[True-compile] 0.6354ms 0.4635ms 2.1576 KOps/s 2.1282 KOps/s $\color{#35bf28}+1.38\%$
test_func_call_runtime[True-compile-overhead] 0.5515ms 0.4686ms 2.1340 KOps/s 2.1600 KOps/s $\color{#d91a1a}-1.20\%$
test_func_call_cm_runtime[False-eager] 1.0773ms 0.5489ms 1.8219 KOps/s 1.8067 KOps/s $\color{#35bf28}+0.84\%$
test_func_call_cm_runtime[False-compile] 0.5602ms 0.4464ms 2.2401 KOps/s 2.2579 KOps/s $\color{#d91a1a}-0.79\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7556ms 0.4510ms 2.2174 KOps/s 2.2620 KOps/s $\color{#d91a1a}-1.97\%$
test_func_call_cm_runtime[True-eager] 1.1196ms 0.9225ms 1.0840 KOps/s 1.0837 KOps/s $\color{#35bf28}+0.03\%$
test_func_call_cm_runtime[True-compile] 1.1596ms 0.8167ms 1.2245 KOps/s 1.2344 KOps/s $\color{#d91a1a}-0.80\%$
test_func_call_cm_runtime[True-compile-overhead] 1.2303ms 0.8246ms 1.2127 KOps/s 1.2286 KOps/s $\color{#d91a1a}-1.29\%$
test_vmap_func_call_cm_runtime[eager] 3.3584ms 1.9284ms 518.5612 Ops/s 523.9568 Ops/s $\color{#d91a1a}-1.03\%$
test_vmap_func_call_cm_runtime[compile] 0.9650ms 0.5361ms 1.8652 KOps/s 1.8491 KOps/s $\color{#35bf28}+0.87\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6528ms 0.5326ms 1.8776 KOps/s 1.8630 KOps/s $\color{#35bf28}+0.78\%$
test_distributed 0.3142ms 0.1241ms 8.0590 KOps/s 7.7874 KOps/s $\color{#35bf28}+3.49\%$
test_tdmodule 90.4790μs 27.6661μs 36.1453 KOps/s 36.7329 KOps/s $\color{#d91a1a}-1.60\%$
test_tdmodule_dispatch 80.3900μs 50.4174μs 19.8344 KOps/s 19.3988 KOps/s $\color{#35bf28}+2.25\%$
test_tdseq 64.4100μs 31.3448μs 31.9032 KOps/s 33.5396 KOps/s $\color{#d91a1a}-4.88\%$
test_tdseq_dispatch 80.6010μs 54.3222μs 18.4087 KOps/s 18.0737 KOps/s $\color{#35bf28}+1.85\%$
test_instantiation_functorch 1.7503ms 1.5301ms 653.5659 Ops/s 662.3571 Ops/s $\color{#d91a1a}-1.33\%$
test_exec_functorch 0.3106ms 0.1825ms 5.4790 KOps/s 5.4755 KOps/s $\color{#35bf28}+0.06\%$
test_exec_functional_call 0.3707ms 0.1803ms 5.5467 KOps/s 5.6763 KOps/s $\color{#d91a1a}-2.28\%$
test_exec_td_decorator 0.4477ms 0.2379ms 4.2035 KOps/s 4.1711 KOps/s $\color{#35bf28}+0.78\%$
test_vmap_mlp_speed_decorator[True-True] 0.8035ms 0.6580ms 1.5199 KOps/s 1.4970 KOps/s $\color{#35bf28}+1.53\%$
test_vmap_mlp_speed_decorator[True-False] 0.9927ms 0.6596ms 1.5160 KOps/s 1.4920 KOps/s $\color{#35bf28}+1.61\%$
test_vmap_mlp_speed_decorator[False-True] 0.9493ms 0.5346ms 1.8705 KOps/s 1.8785 KOps/s $\color{#d91a1a}-0.42\%$
test_vmap_mlp_speed_decorator[False-False] 1.2954ms 0.5434ms 1.8402 KOps/s 1.8711 KOps/s $\color{#d91a1a}-1.65\%$
test_to_module_speed[True] 1.9583ms 1.3491ms 741.2379 Ops/s 759.5928 Ops/s $\color{#d91a1a}-2.42\%$
test_to_module_speed[False] 2.1719ms 1.3182ms 758.6313 Ops/s 779.5956 Ops/s $\color{#d91a1a}-2.69\%$
test_tc_init 91.2810μs 45.6039μs 21.9279 KOps/s 22.5244 KOps/s $\color{#d91a1a}-2.65\%$
test_tc_init_nested 0.1635ms 91.4916μs 10.9300 KOps/s 11.0721 KOps/s $\color{#d91a1a}-1.28\%$
test_tc_first_layer_tensor 22.8730μs 1.5090μs 662.6701 KOps/s 659.0738 KOps/s $\color{#35bf28}+0.55\%$
test_tc_first_layer_nontensor 27.3710μs 4.7023μs 212.6637 KOps/s 214.9562 KOps/s $\color{#d91a1a}-1.07\%$
test_tc_second_layer_tensor 23.0730μs 2.8360μs 352.6117 KOps/s 355.3047 KOps/s $\color{#d91a1a}-0.76\%$
test_tc_second_layer_nontensor 45.1740μs 6.0121μs 166.3322 KOps/s 167.7909 KOps/s $\color{#d91a1a}-0.87\%$
test_unbind 0.2233s 12.8990ms 77.5254 Ops/s 77.9915 Ops/s $\color{#d91a1a}-0.60\%$
test_full_like 9.0403ms 8.5046ms 117.5828 Ops/s 130.7495 Ops/s $\textbf{\color{#d91a1a}-10.07\%}$
test_zeros_like 5.3129ms 2.5881ms 386.3798 Ops/s 361.3585 Ops/s $\textbf{\color{#35bf28}+6.92\%}$
test_ones_like 4.4262ms 3.0932ms 323.2939 Ops/s 284.3036 Ops/s $\textbf{\color{#35bf28}+13.71\%}$
test_clone 9.4661ms 6.2015ms 161.2519 Ops/s 197.4588 Ops/s $\textbf{\color{#d91a1a}-18.34\%}$
test_squeeze 58.5700μs 12.7804μs 78.2446 KOps/s 77.8769 KOps/s $\color{#35bf28}+0.47\%$
test_unsqueeze 0.1645ms 96.0532μs 10.4109 KOps/s 10.8008 KOps/s $\color{#d91a1a}-3.61\%$
test_split 0.4354ms 0.1964ms 5.0918 KOps/s 5.1707 KOps/s $\color{#d91a1a}-1.53\%$
test_permute 0.2960ms 0.2012ms 4.9695 KOps/s 4.9378 KOps/s $\color{#35bf28}+0.64\%$
test_stack 31.4263ms 23.5216ms 42.5142 Ops/s 40.7310 Ops/s $\color{#35bf28}+4.38\%$
test_cat 31.2299ms 23.6839ms 42.2228 Ops/s 41.1116 Ops/s $\color{#35bf28}+2.70\%$
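
The Change column appears to be the relative difference between Ops and Ops on Repo HEAD, that is, (Ops - HEAD) / HEAD expressed as a percentage. The sketch below assumes that formula; the helper name `percent_change` is hypothetical and is not part of the benchmark tooling. It reproduces two rows from the CPU table above.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of `ops` versus `ops_head`, in percent, rounded to 2 decimals.

    Assumption: this is how the Change column is derived; both inputs must
    be expressed in the same unit (e.g. both in KOps/s).
    """
    return round((ops - ops_head) / ops_head * 100.0, 2)


# test_plain_set_nested: 48.6139 KOps/s vs 47.9152 KOps/s on HEAD
print(percent_change(48.6139, 47.9152))  # 1.46

# test_serialize_weights_pickle: 1.4225 Ops/s vs 2.5201 Ops/s on HEAD
print(percent_change(1.4225, 2.5201))  # -43.55
```

A positive value means more operations per second than on HEAD (faster); a negative value means fewer (slower).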

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}22$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1497ms 12.9662μs 77.1234 KOps/s 80.3384 KOps/s $\color{#d91a1a}-4.00\%$
test_plain_set_stack_nested 43.9910μs 13.0501μs 76.6280 KOps/s 79.2891 KOps/s $\color{#d91a1a}-3.36\%$
test_plain_set_nested_inplace 84.1210μs 14.0473μs 71.1882 KOps/s 73.7594 KOps/s $\color{#d91a1a}-3.49\%$
test_plain_set_stack_nested_inplace 42.9300μs 13.8740μs 72.0773 KOps/s 74.5861 KOps/s $\color{#d91a1a}-3.36\%$
test_items 41.3110μs 2.8471μs 351.2320 KOps/s 349.7694 KOps/s $\color{#35bf28}+0.42\%$
test_items_nested 0.5381ms 0.3644ms 2.7439 KOps/s 2.7881 KOps/s $\color{#d91a1a}-1.58\%$
test_items_nested_locked 0.4057ms 0.3701ms 2.7019 KOps/s 2.7720 KOps/s $\color{#d91a1a}-2.53\%$
test_items_nested_leaf 0.1365ms 60.7027μs 16.4737 KOps/s 16.6370 KOps/s $\color{#d91a1a}-0.98\%$
test_items_stack_nested 0.4143ms 0.3643ms 2.7449 KOps/s 2.7795 KOps/s $\color{#d91a1a}-1.25\%$
test_items_stack_nested_leaf 0.1051ms 60.5414μs 16.5176 KOps/s 16.6384 KOps/s $\color{#d91a1a}-0.73\%$
test_items_stack_nested_locked 0.4445ms 0.3648ms 2.7409 KOps/s 2.7953 KOps/s $\color{#d91a1a}-1.95\%$
test_keys 33.0000μs 3.4286μs 291.6604 KOps/s 293.8900 KOps/s $\color{#d91a1a}-0.76\%$
test_keys_nested 0.1493ms 87.2589μs 11.4601 KOps/s 11.3864 KOps/s $\color{#35bf28}+0.65\%$
test_keys_nested_locked 0.7861ms 93.3730μs 10.7097 KOps/s 10.7749 KOps/s $\color{#d91a1a}-0.60\%$
test_keys_nested_leaf 0.1148ms 78.7519μs 12.6981 KOps/s 12.7842 KOps/s $\color{#d91a1a}-0.67\%$
test_keys_stack_nested 0.1301ms 87.5512μs 11.4219 KOps/s 11.5090 KOps/s $\color{#d91a1a}-0.76\%$
test_keys_stack_nested_leaf 0.1190ms 78.6285μs 12.7180 KOps/s 12.8160 KOps/s $\color{#d91a1a}-0.76\%$
test_keys_stack_nested_locked 0.1419ms 93.4313μs 10.7030 KOps/s 10.7982 KOps/s $\color{#d91a1a}-0.88\%$
test_values 4.7333μs 0.8508μs 1.1754 MOps/s 1.1699 MOps/s $\color{#35bf28}+0.47\%$
test_values_nested 66.2400μs 37.0609μs 26.9826 KOps/s 27.2198 KOps/s $\color{#d91a1a}-0.87\%$
test_values_nested_locked 67.2410μs 38.9211μs 25.6930 KOps/s 25.7892 KOps/s $\color{#d91a1a}-0.37\%$
test_values_nested_leaf 93.0910μs 42.5401μs 23.5072 KOps/s 23.9832 KOps/s $\color{#d91a1a}-1.98\%$
test_values_stack_nested 74.1710μs 37.1366μs 26.9276 KOps/s 27.0046 KOps/s $\color{#d91a1a}-0.28\%$
test_values_stack_nested_leaf 75.0810μs 42.3829μs 23.5944 KOps/s 23.7289 KOps/s $\color{#d91a1a}-0.57\%$
test_values_stack_nested_locked 76.5110μs 39.3227μs 25.4306 KOps/s 25.7223 KOps/s $\color{#d91a1a}-1.13\%$
test_membership 1.6900μs 0.5003μs 1.9987 MOps/s 1.9924 MOps/s $\color{#35bf28}+0.32\%$
test_membership_nested 30.0905μs 1.9648μs 508.9628 KOps/s 488.5372 KOps/s $\color{#35bf28}+4.18\%$
test_membership_nested_leaf 27.9350μs 1.9789μs 505.3409 KOps/s 507.2284 KOps/s $\color{#d91a1a}-0.37\%$
test_membership_stacked_nested 35.5800μs 2.0774μs 481.3819 KOps/s 489.8336 KOps/s $\color{#d91a1a}-1.73\%$
test_membership_stacked_nested_leaf 23.7400μs 2.0560μs 486.3792 KOps/s 492.0336 KOps/s $\color{#d91a1a}-1.15\%$
test_membership_nested_last 75.3610μs 2.9970μs 333.6628 KOps/s 327.0534 KOps/s $\color{#35bf28}+2.02\%$
test_membership_nested_leaf_last 26.4500μs 3.0321μs 329.8061 KOps/s 328.3819 KOps/s $\color{#35bf28}+0.43\%$
test_membership_stacked_nested_last 46.9900μs 3.0129μs 331.9027 KOps/s 329.8205 KOps/s $\color{#35bf28}+0.63\%$
test_membership_stacked_nested_leaf_last 39.3310μs 3.0132μs 331.8702 KOps/s 335.9681 KOps/s $\color{#d91a1a}-1.22\%$
test_nested_getleaf 76.8310μs 6.1696μs 162.0838 KOps/s 161.1178 KOps/s $\color{#35bf28}+0.60\%$
test_nested_get 30.5510μs 5.8447μs 171.0960 KOps/s 169.2782 KOps/s $\color{#35bf28}+1.07\%$
test_stacked_getleaf 47.6810μs 6.0654μs 164.8700 KOps/s 163.1100 KOps/s $\color{#35bf28}+1.08\%$
test_stacked_get 28.1010μs 5.7306μs 174.5013 KOps/s 175.7378 KOps/s $\color{#d91a1a}-0.70\%$
test_nested_getitemleaf 45.2210μs 6.3663μs 157.0766 KOps/s 157.4891 KOps/s $\color{#d91a1a}-0.26\%$
test_nested_getitem 32.8110μs 6.0408μs 165.5396 KOps/s 165.3038 KOps/s $\color{#35bf28}+0.14\%$
test_stacked_getitemleaf 45.4010μs 6.3042μs 158.6254 KOps/s 158.7933 KOps/s $\color{#d91a1a}-0.11\%$
test_stacked_getitem 0.3682ms 5.9361μs 168.4607 KOps/s 168.5020 KOps/s $\color{#d91a1a}-0.02\%$
test_lock_nested 9.8100ms 0.3448ms 2.9003 KOps/s 2.9036 KOps/s $\color{#d91a1a}-0.11\%$
test_lock_stack_nested 0.4774ms 0.3446ms 2.9019 KOps/s 2.9219 KOps/s $\color{#d91a1a}-0.68\%$
test_unlock_nested 0.4278ms 0.2801ms 3.5698 KOps/s 3.5470 KOps/s $\color{#35bf28}+0.64\%$
test_unlock_stack_nested 0.3389ms 0.2811ms 3.5578 KOps/s 3.5374 KOps/s $\color{#35bf28}+0.58\%$
test_flatten_speed 0.1159ms 78.8309μs 12.6854 KOps/s 13.0950 KOps/s $\color{#d91a1a}-3.13\%$
test_unflatten_speed 0.3876ms 0.3211ms 3.1142 KOps/s 3.1425 KOps/s $\color{#d91a1a}-0.90\%$
test_common_ops 0.8325ms 0.6199ms 1.6132 KOps/s 1.6721 KOps/s $\color{#d91a1a}-3.52\%$
test_creation 0.1210ms 1.7132μs 583.6934 KOps/s 586.0269 KOps/s $\color{#d91a1a}-0.40\%$
test_creation_empty 47.0600μs 9.3540μs 106.9057 KOps/s 120.1582 KOps/s $\textbf{\color{#d91a1a}-11.03\%}$
test_creation_nested_1 44.7710μs 11.0055μs 90.8635 KOps/s 99.8041 KOps/s $\textbf{\color{#d91a1a}-8.96\%}$
test_creation_nested_2 41.6110μs 13.7491μs 72.7319 KOps/s 79.4605 KOps/s $\textbf{\color{#d91a1a}-8.47\%}$
test_clone 54.5510μs 10.0742μs 99.2632 KOps/s 98.4947 KOps/s $\color{#35bf28}+0.78\%$
test_getitem[int] 1.5564ms 10.9242μs 91.5402 KOps/s 95.2119 KOps/s $\color{#d91a1a}-3.86\%$
test_getitem[slice_int] 0.1605ms 20.8111μs 48.0512 KOps/s 48.7478 KOps/s $\color{#d91a1a}-1.43\%$
test_getitem[range] 0.2359ms 37.5799μs 26.6100 KOps/s 28.1728 KOps/s $\textbf{\color{#d91a1a}-5.55\%}$
test_getitem[tuple] 0.1134ms 18.2743μs 54.7216 KOps/s 55.6594 KOps/s $\color{#d91a1a}-1.68\%$
test_getitem[list] 0.1311ms 32.4310μs 30.8347 KOps/s 31.8889 KOps/s $\color{#d91a1a}-3.31\%$
test_setitem_dim[int] 37.5210μs 18.5441μs 53.9256 KOps/s 52.8625 KOps/s $\color{#35bf28}+2.01\%$
test_setitem_dim[slice_int] 67.4810μs 37.8243μs 26.4380 KOps/s 27.2590 KOps/s $\color{#d91a1a}-3.01\%$
test_setitem_dim[range] 74.0510μs 51.5480μs 19.3994 KOps/s 19.6695 KOps/s $\color{#d91a1a}-1.37\%$
test_setitem_dim[tuple] 52.1400μs 31.6235μs 31.6220 KOps/s 32.1856 KOps/s $\color{#d91a1a}-1.75\%$
test_setitem 60.2610μs 15.1027μs 66.2132 KOps/s 70.2951 KOps/s $\textbf{\color{#d91a1a}-5.81\%}$
test_set 0.2036ms 14.5439μs 68.7576 KOps/s 72.4803 KOps/s $\textbf{\color{#d91a1a}-5.14\%}$
test_set_shared 0.5918ms 0.1564ms 6.3938 KOps/s 6.4774 KOps/s $\color{#d91a1a}-1.29\%$
test_update 0.4447ms 18.2179μs 54.8911 KOps/s 58.9333 KOps/s $\textbf{\color{#d91a1a}-6.86\%}$
test_update_nested 66.4410μs 23.5221μs 42.5132 KOps/s 44.2692 KOps/s $\color{#d91a1a}-3.97\%$
test_update__nested 0.5390ms 24.6247μs 40.6097 KOps/s 42.5987 KOps/s $\color{#d91a1a}-4.67\%$
test_set_nested 0.1011ms 15.7873μs 63.3422 KOps/s 64.7483 KOps/s $\color{#d91a1a}-2.17\%$
test_set_nested_new 0.1045ms 17.9405μs 55.7397 KOps/s 56.4229 KOps/s $\color{#d91a1a}-1.21\%$
test_select 70.5810μs 30.0835μs 33.2408 KOps/s 35.0985 KOps/s $\textbf{\color{#d91a1a}-5.29\%}$
test_select_nested 79.7510μs 42.9415μs 23.2875 KOps/s 23.2953 KOps/s $\color{#d91a1a}-0.03\%$
test_exclude_nested 98.8510μs 62.1863μs 16.0807 KOps/s 16.1512 KOps/s $\color{#d91a1a}-0.44\%$
test_empty[True] 0.4023ms 0.2919ms 3.4256 KOps/s 3.4557 KOps/s $\color{#d91a1a}-0.87\%$
test_empty[False] 4.4311μs 0.8125μs 1.2308 MOps/s 1.2366 MOps/s $\color{#d91a1a}-0.47\%$
test_to 0.1175ms 60.1779μs 16.6174 KOps/s 17.9979 KOps/s $\textbf{\color{#d91a1a}-7.67\%}$
test_to_nonblocking 0.1094ms 45.4820μs 21.9867 KOps/s 22.1692 KOps/s $\color{#d91a1a}-0.82\%$
test_unbind_speed 0.3074ms 0.2408ms 4.1536 KOps/s 4.1803 KOps/s $\color{#d91a1a}-0.64\%$
test_unbind_speed_stack0 0.3179ms 0.2350ms 4.2549 KOps/s 4.2821 KOps/s $\color{#d91a1a}-0.64\%$
test_unbind_speed_stack1 0.1123s 0.7481ms 1.3367 KOps/s 1.3309 KOps/s $\color{#35bf28}+0.43\%$
test_split 0.1043s 1.6213ms 616.8056 Ops/s 621.5910 Ops/s $\color{#d91a1a}-0.77\%$
test_chunk 0.1151s 1.6377ms 610.6076 Ops/s 622.4754 Ops/s $\color{#d91a1a}-1.91\%$
test_consolidate[False-None] 3.1330ms 2.7601ms 362.3096 Ops/s 332.3167 Ops/s $\textbf{\color{#35bf28}+9.03\%}$
test_consolidate[default-None] 1.8989ms 1.7452ms 573.0103 Ops/s 590.8742 Ops/s $\color{#d91a1a}-3.02\%$
test_consolidate[reduce-overhead-None] 1.9029ms 1.7549ms 569.8191 Ops/s 579.2034 Ops/s $\color{#d91a1a}-1.62\%$
test_consolidate_njt[False-None] 6.8521ms 6.5568ms 152.5126 Ops/s 154.9531 Ops/s $\color{#d91a1a}-1.58\%$
test_to[False-False-None] 1.8798ms 1.7002ms 588.1541 Ops/s 599.0993 Ops/s $\color{#d91a1a}-1.83\%$
test_to[True-False-None] 1.5357ms 1.3423ms 744.9772 Ops/s 755.4403 Ops/s $\color{#d91a1a}-1.39\%$
test_to[within-False-None] 4.4795ms 4.2157ms 237.2110 Ops/s 241.1473 Ops/s $\color{#d91a1a}-1.63\%$
test_to[True-default-None] 5.8201ms 5.2689ms 189.7942 Ops/s 195.0482 Ops/s $\color{#d91a1a}-2.69\%$
test_to_njt[False-False-None] 7.3226ms 6.9302ms 144.2953 Ops/s 147.7863 Ops/s $\color{#d91a1a}-2.36\%$
test_to_njt[True-False-None] 5.9120ms 5.5844ms 179.0708 Ops/s 186.3449 Ops/s $\color{#d91a1a}-3.90\%$
test_to_njt[within-False-None] 12.5549ms 12.1121ms 82.5624 Ops/s 84.3894 Ops/s $\color{#d91a1a}-2.16\%$
test_creation[device0] 0.6560ms 82.2385μs 12.1598 KOps/s 12.7500 KOps/s $\color{#d91a1a}-4.63\%$
test_creation_from_tensor 0.6442ms 87.4978μs 11.4289 KOps/s 12.2424 KOps/s $\textbf{\color{#d91a1a}-6.65\%}$
test_add_one[memmap_tensor0] 0.4607ms 6.5367μs 152.9818 KOps/s 158.9742 KOps/s $\color{#d91a1a}-3.77\%$
test_contiguous[memmap_tensor0] 1.8850μs 0.4243μs 2.3566 MOps/s 2.3758 MOps/s $\color{#d91a1a}-0.81\%$
test_stack[memmap_tensor0] 0.1200ms 4.5723μs 218.7065 KOps/s 216.7132 KOps/s $\color{#35bf28}+0.92\%$
test_memmaptd_index 1.7278ms 0.2413ms 4.1437 KOps/s 4.1869 KOps/s $\color{#d91a1a}-1.03\%$
test_memmaptd_index_astensor 0.4355ms 0.3060ms 3.2682 KOps/s 3.3110 KOps/s $\color{#d91a1a}-1.29\%$
test_memmaptd_index_op 0.7241ms 0.5876ms 1.7018 KOps/s 1.8076 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_serialize_model 0.1331s 0.1316s 7.6015 Ops/s 7.6146 Ops/s $\color{#d91a1a}-0.17\%$
test_serialize_model_pickle 1.6895s 1.3236s 0.7555 Ops/s 0.8247 Ops/s $\textbf{\color{#d91a1a}-8.40\%}$
test_serialize_weights 0.1325s 0.1305s 7.6652 Ops/s 7.6237 Ops/s $\color{#35bf28}+0.54\%$
test_serialize_weights_returnearly 0.3937s 58.7566ms 17.0194 Ops/s 10.3987 Ops/s $\textbf{\color{#35bf28}+63.67\%}$
test_serialize_weights_pickle 1.7219s 1.3408s 0.7458 Ops/s 0.8222 Ops/s $\textbf{\color{#d91a1a}-9.28\%}$
test_reshape_pytree 53.2710μs 22.6195μs 44.2096 KOps/s 45.8115 KOps/s $\color{#d91a1a}-3.50\%$
test_reshape_td 60.0110μs 26.9519μs 37.1031 KOps/s 36.8953 KOps/s $\color{#35bf28}+0.56\%$
test_view_pytree 51.0100μs 22.4178μs 44.6075 KOps/s 45.7788 KOps/s $\color{#d91a1a}-2.56\%$
test_view_td 85.8210μs 31.5829μs 31.6627 KOps/s 30.0473 KOps/s $\textbf{\color{#35bf28}+5.38\%}$
test_unbind_pytree 88.4110μs 28.2425μs 35.4076 KOps/s 35.7386 KOps/s $\color{#d91a1a}-0.93\%$
test_unbind_td 0.7335ms 36.1013μs 27.6998 KOps/s 27.0454 KOps/s $\color{#35bf28}+2.42\%$
test_split_pytree 0.1651ms 30.1994μs 33.1132 KOps/s 33.7320 KOps/s $\color{#d91a1a}-1.83\%$
test_split_td 0.8421ms 39.8100μs 25.1193 KOps/s 25.5686 KOps/s $\color{#d91a1a}-1.76\%$
test_add_pytree 0.1201ms 33.6503μs 29.7174 KOps/s 30.3084 KOps/s $\color{#d91a1a}-1.95\%$
test_add_td 0.1836ms 51.8403μs 19.2900 KOps/s 21.9975 KOps/s $\textbf{\color{#d91a1a}-12.31\%}$
test_compile_add_one_nested[tensordict-compile] 0.1919ms 0.1210ms 8.2637 KOps/s 8.0277 KOps/s $\color{#35bf28}+2.94\%$
test_compile_add_one_nested[tensordict-eager] 0.2825ms 0.1335ms 7.4893 KOps/s 7.4759 KOps/s $\color{#35bf28}+0.18\%$
test_compile_add_one_nested[pytree-compile] 0.1432ms 94.8856μs 10.5390 KOps/s 10.2608 KOps/s $\color{#35bf28}+2.71\%$
test_compile_add_one_nested[pytree-eager] 1.4928ms 0.1446ms 6.9134 KOps/s 6.8500 KOps/s $\color{#35bf28}+0.93\%$
test_compile_copy_nested[tensordict-compile] 0.1441ms 24.1002μs 41.4934 KOps/s 43.5114 KOps/s $\color{#d91a1a}-4.64\%$
test_compile_copy_nested[tensordict-eager] 0.1595ms 29.6398μs 33.7385 KOps/s 34.9812 KOps/s $\color{#d91a1a}-3.55\%$
test_compile_copy_nested[pytree-compile] 0.3767ms 63.6600μs 15.7085 KOps/s 15.4352 KOps/s $\color{#35bf28}+1.77\%$
test_compile_copy_nested[pytree-eager] 0.1870ms 48.9061μs 20.4473 KOps/s 20.2693 KOps/s $\color{#35bf28}+0.88\%$
test_compile_add_one_flat[tensordict-compile] 0.1873ms 0.1413ms 7.0780 KOps/s 7.0865 KOps/s $\color{#d91a1a}-0.12\%$
test_compile_add_one_flat[tensordict-eager] 0.3470ms 0.2181ms 4.5840 KOps/s 4.6606 KOps/s $\color{#d91a1a}-1.64\%$
test_compile_add_one_flat[tensorclass-compile] 0.2078ms 96.3937μs 10.3741 KOps/s 10.3709 KOps/s $\color{#35bf28}+0.03\%$
test_compile_add_one_flat[tensorclass-eager] 0.3067ms 55.2034μs 18.1148 KOps/s 18.0487 KOps/s $\color{#35bf28}+0.37\%$
test_compile_add_one_flat[pytree-compile] 0.2700ms 0.1353ms 7.3933 KOps/s 7.3652 KOps/s $\color{#35bf28}+0.38\%$
test_compile_add_one_flat[pytree-eager] 0.6303ms 0.4730ms 2.1142 KOps/s 2.1297 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_add_self_flat[tensordict-eager] 0.4016ms 0.2620ms 3.8165 KOps/s 3.8309 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_add_self_flat[tensordict-compile] 0.2683ms 0.1423ms 7.0294 KOps/s 7.1829 KOps/s $\color{#d91a1a}-2.14\%$
test_compile_add_self_flat[tensorclass-eager] 0.2182ms 66.4795μs 15.0422 KOps/s 14.6427 KOps/s $\color{#35bf28}+2.73\%$
test_compile_add_self_flat[tensorclass-compile] 0.2297ms 0.1021ms 9.7975 KOps/s 10.3394 KOps/s $\textbf{\color{#d91a1a}-5.24\%}$
test_compile_add_self_flat[pytree-eager] 0.5591ms 0.3932ms 2.5431 KOps/s 2.5366 KOps/s $\color{#35bf28}+0.26\%$
test_compile_add_self_flat[pytree-compile] 0.2809ms 0.1346ms 7.4285 KOps/s 7.4739 KOps/s $\color{#d91a1a}-0.61\%$
test_compile_copy_flat[tensordict-compile] 60.3410μs 18.9824μs 52.6804 KOps/s 56.1928 KOps/s $\textbf{\color{#d91a1a}-6.25\%}$
test_compile_copy_flat[tensordict-eager] 0.2614ms 31.6702μs 31.5754 KOps/s 32.1791 KOps/s $\color{#d91a1a}-1.88\%$
test_compile_copy_flat[pytree-compile] 0.2290ms 69.6800μs 14.3513 KOps/s 14.3424 KOps/s $\color{#35bf28}+0.06\%$
test_compile_copy_flat[pytree-eager] 0.1011ms 51.9476μs 19.2502 KOps/s 19.1567 KOps/s $\color{#35bf28}+0.49\%$
test_compile_assign_and_add[tensordict-compile] 1.6589ms 0.4523ms 2.2108 KOps/s 2.2039 KOps/s $\color{#35bf28}+0.31\%$
test_compile_assign_and_add[tensordict-eager] 2.7870ms 2.5616ms 390.3783 Ops/s 394.9628 Ops/s $\color{#d91a1a}-1.16\%$
test_compile_assign_and_add[pytree-compile] 1.8012ms 0.4732ms 2.1131 KOps/s 2.1689 KOps/s $\color{#d91a1a}-2.57\%$
test_compile_assign_and_add[pytree-eager] 2.8094ms 2.5388ms 393.8877 Ops/s 396.7864 Ops/s $\color{#d91a1a}-0.73\%$
test_compile_indexing[tensor-tensordict-compile] 0.2994ms 0.1160ms 8.6198 KOps/s 9.0531 KOps/s $\color{#d91a1a}-4.79\%$
test_compile_indexing[tensor-tensordict-eager] 0.5881ms 79.2719μs 12.6148 KOps/s 12.7535 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2751ms 0.1093ms 9.1467 KOps/s 9.4830 KOps/s $\color{#d91a1a}-3.55\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2490ms 66.1270μs 15.1224 KOps/s 14.9495 KOps/s $\color{#35bf28}+1.16\%$
test_compile_indexing[tensor-pytree-compile] 0.2729ms 0.1092ms 9.1599 KOps/s 9.5870 KOps/s $\color{#d91a1a}-4.45\%$
test_compile_indexing[tensor-pytree-eager] 0.2387ms 67.1792μs 14.8856 KOps/s 15.2900 KOps/s $\color{#d91a1a}-2.65\%$
test_compile_indexing[slice-tensordict-compile] 0.1646ms 0.1023ms 9.7766 KOps/s 9.9521 KOps/s $\color{#d91a1a}-1.76\%$
test_compile_indexing[slice-tensordict-eager] 0.1545ms 17.5183μs 57.0831 KOps/s 59.7687 KOps/s $\color{#d91a1a}-4.49\%$
test_compile_indexing[slice-tensorclass-compile] 0.2512ms 97.9367μs 10.2107 KOps/s 10.3395 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_indexing[slice-tensorclass-eager] 0.1505ms 15.9954μs 62.5178 KOps/s 63.9040 KOps/s $\color{#d91a1a}-2.17\%$
test_compile_indexing[slice-pytree-compile] 0.2468ms 99.4508μs 10.0552 KOps/s 10.3848 KOps/s $\color{#d91a1a}-3.17\%$
test_compile_indexing[slice-pytree-eager] 0.1537ms 15.9475μs 62.7058 KOps/s 64.6558 KOps/s $\color{#d91a1a}-3.02\%$
test_compile_indexing[int-tensordict-compile] 0.2547ms 0.1028ms 9.7238 KOps/s 9.9630 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_indexing[int-tensordict-eager] 0.8064ms 17.1790μs 58.2105 KOps/s 59.2270 KOps/s $\color{#d91a1a}-1.72\%$
test_compile_indexing[int-tensorclass-compile] 0.2759ms 98.7651μs 10.1250 KOps/s 10.4060 KOps/s $\color{#d91a1a}-2.70\%$
test_compile_indexing[int-tensorclass-eager] 0.1911ms 15.8209μs 63.2074 KOps/s 64.3994 KOps/s $\color{#d91a1a}-1.85\%$
test_compile_indexing[int-pytree-compile] 0.2478ms 97.5479μs 10.2514 KOps/s 10.2425 KOps/s $\color{#35bf28}+0.09\%$
test_compile_indexing[int-pytree-eager] 53.9000μs 16.5513μs 60.4180 KOps/s 63.0997 KOps/s $\color{#d91a1a}-4.25\%$
test_mod_add[eager] 0.1975ms 38.7223μs 25.8249 KOps/s 27.3720 KOps/s $\textbf{\color{#d91a1a}-5.65\%}$
test_mod_add[compile] 0.2431ms 83.7092μs 11.9461 KOps/s 12.5151 KOps/s $\color{#d91a1a}-4.55\%$
test_mod_add[compile-overhead] 0.3421ms 0.1701ms 5.8804 KOps/s 5.6330 KOps/s $\color{#35bf28}+4.39\%$
test_mod_wrap[eager] 0.4334ms 0.2556ms 3.9124 KOps/s 4.0693 KOps/s $\color{#d91a1a}-3.86\%$
test_mod_wrap[compile] 0.4282ms 0.2849ms 3.5097 KOps/s 3.5087 KOps/s $\color{#35bf28}+0.03\%$
test_mod_wrap[compile-overhead] 7.3942ms 3.9079ms 255.8935 Ops/s 261.9192 Ops/s $\color{#d91a1a}-2.30\%$
test_mod_wrap_and_backward[eager] 1.5433ms 1.3228ms 756.0002 Ops/s 745.2710 Ops/s $\color{#35bf28}+1.44\%$
test_mod_wrap_and_backward[compile] 1.5782ms 1.2557ms 796.3944 Ops/s 791.4436 Ops/s $\color{#35bf28}+0.63\%$
test_mod_wrap_and_backward[compile-overhead] 1.4349ms 0.9308ms 1.0743 KOps/s 1.0661 KOps/s $\color{#35bf28}+0.78\%$
test_seq_add[eager] 0.5283ms 0.1149ms 8.7051 KOps/s 8.6257 KOps/s $\color{#35bf28}+0.92\%$
test_seq_add[compile] 0.4863ms 90.3459μs 11.0686 KOps/s 11.0161 KOps/s $\color{#35bf28}+0.48\%$
test_seq_add[compile-overhead] 0.2796ms 0.1286ms 7.7768 KOps/s 7.5874 KOps/s $\color{#35bf28}+2.50\%$
test_seq_wrap[eager] 0.8433ms 0.4342ms 2.3031 KOps/s 2.2568 KOps/s $\color{#35bf28}+2.05\%$
test_seq_wrap[compile] 0.7144ms 0.3005ms 3.3275 KOps/s 3.1546 KOps/s $\textbf{\color{#35bf28}+5.48\%}$
test_seq_wrap[compile-overhead] 0.4234ms 0.2253ms 4.4389 KOps/s 4.3962 KOps/s $\color{#35bf28}+0.97\%$
test_func_call_runtime[False-eager] 0.8905ms 0.7237ms 1.3818 KOps/s 1.3118 KOps/s $\textbf{\color{#35bf28}+5.34\%}$
test_func_call_runtime[False-compile] 0.8983ms 0.7447ms 1.3428 KOps/s 1.3354 KOps/s $\color{#35bf28}+0.55\%$
test_func_call_runtime[False-compile-overhead] 0.5036ms 0.3675ms 2.7207 KOps/s 2.7417 KOps/s $\color{#d91a1a}-0.77\%$
test_func_call_runtime[True-eager] 1.0807ms 0.8981ms 1.1135 KOps/s 1.1348 KOps/s $\color{#d91a1a}-1.88\%$
test_func_call_runtime[True-compile] 0.9075ms 0.7695ms 1.2995 KOps/s 1.3016 KOps/s $\color{#d91a1a}-0.16\%$
test_func_call_runtime[True-compile-overhead] 0.5463ms 0.3921ms 2.5506 KOps/s 2.5862 KOps/s $\color{#d91a1a}-1.38\%$
test_func_call_cm_runtime[False-eager] 0.9012ms 0.7390ms 1.3532 KOps/s 1.3966 KOps/s $\color{#d91a1a}-3.11\%$
test_func_call_cm_runtime[False-compile] 0.9332ms 0.7569ms 1.3211 KOps/s 1.3362 KOps/s $\color{#d91a1a}-1.13\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5079ms 0.3666ms 2.7274 KOps/s 2.7185 KOps/s $\color{#35bf28}+0.33\%$
test_func_call_cm_runtime[True-eager] 1.2109ms 0.9895ms 1.0107 KOps/s 1.0036 KOps/s $\color{#35bf28}+0.70\%$
test_func_call_cm_runtime[True-compile] 1.1279ms 0.9639ms 1.0375 KOps/s 1.0199 KOps/s $\color{#35bf28}+1.72\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1896ms 0.9886ms 1.0116 KOps/s 981.6685 Ops/s $\color{#35bf28}+3.05\%$
test_vmap_func_call_cm_runtime[eager] 2.4634ms 2.0207ms 494.8853 Ops/s 489.6128 Ops/s $\color{#35bf28}+1.08\%$
test_vmap_func_call_cm_runtime[compile] 0.9617ms 0.8080ms 1.2377 KOps/s 1.1880 KOps/s $\color{#35bf28}+4.18\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4880ms 0.4175ms 2.3953 KOps/s 2.3809 KOps/s $\color{#35bf28}+0.60\%$
test_distributed 1.9762ms 0.2582ms 3.8736 KOps/s 8.6804 KOps/s $\textbf{\color{#d91a1a}-55.38\%}$
test_tdmodule 0.4332ms 21.1979μs 47.1745 KOps/s 49.6230 KOps/s $\color{#d91a1a}-4.93\%$
test_tdmodule_dispatch 65.4910μs 35.8332μs 27.9071 KOps/s 27.4717 KOps/s $\color{#35bf28}+1.58\%$
test_tdseq 62.8410μs 20.7996μs 48.0778 KOps/s 48.6259 KOps/s $\color{#d91a1a}-1.13\%$
test_tdseq_dispatch 98.7910μs 39.8571μs 25.0896 KOps/s 26.2011 KOps/s $\color{#d91a1a}-4.24\%$
test_instantiation_functorch 1.6511ms 1.5170ms 659.1810 Ops/s 655.5841 Ops/s $\color{#35bf28}+0.55\%$
test_exec_functorch 0.1833ms 0.1374ms 7.2801 KOps/s 7.2727 KOps/s $\color{#35bf28}+0.10\%$
test_exec_functional_call 0.2898ms 0.1322ms 7.5625 KOps/s 7.7494 KOps/s $\color{#d91a1a}-2.41\%$
test_exec_td_decorator 0.3680ms 0.1794ms 5.5747 KOps/s 5.5272 KOps/s $\color{#35bf28}+0.86\%$
test_vmap_mlp_speed_decorator[True-True] 0.8216ms 0.6669ms 1.4995 KOps/s 1.4955 KOps/s $\color{#35bf28}+0.26\%$
test_vmap_mlp_speed_decorator[True-False] 0.8257ms 0.6665ms 1.5004 KOps/s 1.4936 KOps/s $\color{#35bf28}+0.45\%$
test_vmap_mlp_speed_decorator[False-True] 0.7509ms 0.5713ms 1.7503 KOps/s 1.7357 KOps/s $\color{#35bf28}+0.84\%$
test_vmap_mlp_speed_decorator[False-False] 0.7213ms 0.5735ms 1.7437 KOps/s 1.7125 KOps/s $\color{#35bf28}+1.82\%$
test_vmap_transformer_speed_decorator[True-True] 18.7738ms 18.5991ms 53.7660 Ops/s 53.9133 Ops/s $\color{#d91a1a}-0.27\%$
test_vmap_transformer_speed_decorator[True-False] 18.8279ms 18.6428ms 53.6401 Ops/s 53.8052 Ops/s $\color{#d91a1a}-0.31\%$
test_vmap_transformer_speed_decorator[False-True] 18.6236ms 18.4040ms 54.3362 Ops/s 54.1489 Ops/s $\color{#35bf28}+0.35\%$
test_vmap_transformer_speed_decorator[False-False] 18.6290ms 18.3856ms 54.3904 Ops/s 54.1862 Ops/s $\color{#35bf28}+0.38\%$
test_to_module_speed[True] 1.0507ms 0.9512ms 1.0513 KOps/s 1.0349 KOps/s $\color{#35bf28}+1.58\%$
test_to_module_speed[False] 1.3704ms 0.9395ms 1.0644 KOps/s 1.0525 KOps/s $\color{#35bf28}+1.14\%$
test_tc_init 0.1355ms 34.7732μs 28.7578 KOps/s 28.7412 KOps/s $\color{#35bf28}+0.06\%$
test_tc_init_nested 0.1835ms 70.4558μs 14.1933 KOps/s 14.2180 KOps/s $\color{#d91a1a}-0.17\%$
test_tc_first_layer_tensor 29.2900μs 0.7899μs 1.2659 MOps/s 1.2472 MOps/s $\color{#35bf28}+1.50\%$
test_tc_first_layer_nontensor 27.8900μs 2.2308μs 448.2766 KOps/s 457.9040 KOps/s $\color{#d91a1a}-2.10\%$
test_tc_second_layer_tensor 28.6500μs 1.4829μs 674.3553 KOps/s 715.5921 KOps/s $\textbf{\color{#d91a1a}-5.76\%}$
test_tc_second_layer_nontensor 28.8610μs 2.9356μs 340.6512 KOps/s 342.7305 KOps/s $\color{#d91a1a}-0.61\%$
test_unbind 0.2782s 11.0627ms 90.3937 Ops/s 143.5928 Ops/s $\textbf{\color{#d91a1a}-37.05\%}$
test_full_like 14.1386ms 11.0907ms 90.1658 Ops/s 89.8247 Ops/s $\color{#35bf28}+0.38\%$
test_zeros_like 6.3344ms 5.0058ms 199.7665 Ops/s 210.1786 Ops/s $\color{#d91a1a}-4.95\%$
test_ones_like 6.2493ms 5.0144ms 199.4246 Ops/s 203.3744 Ops/s $\color{#d91a1a}-1.94\%$
test_clone 14.8226ms 10.5330ms 94.9393 Ops/s 128.4608 Ops/s $\textbf{\color{#d91a1a}-26.09\%}$
test_squeeze 57.7200μs 9.6064μs 104.0978 KOps/s 103.4161 KOps/s $\color{#35bf28}+0.66\%$
test_unsqueeze 0.1526ms 71.9739μs 13.8939 KOps/s 13.9151 KOps/s $\color{#d91a1a}-0.15\%$
test_split 0.3704ms 0.1586ms 6.3053 KOps/s 6.3972 KOps/s $\color{#d91a1a}-1.44\%$
test_permute 0.2525ms 0.1836ms 5.4476 KOps/s 5.7572 KOps/s $\textbf{\color{#d91a1a}-5.38\%}$
test_stack 55.7966ms 53.3705ms 18.7369 Ops/s 18.5921 Ops/s $\color{#35bf28}+0.78\%$
test_cat 55.8202ms 53.8535ms 18.5689 Ops/s 18.7194 Ops/s $\color{#d91a1a}-0.80\%$
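
Note on reading the table: the Change column appears to be the relative difference between the "Ops" value measured on this PR and the "Ops on Repo HEAD" value. The sketch below is not the benchmark bot's code; it is a minimal reconstruction that reproduces a few of the reported rows. The helper names `to_ops` and `percent_change` are hypothetical. Rows shown in bold seem to be those whose change exceeds 5% in magnitude, but that threshold is inferred from the rows above and not confirmed by the source.

```python
# Hypothetical helpers: not part of tensordict or the benchmark bot.
# Assumption: Change = (Ops on PR / Ops on Repo HEAD - 1) * 100.

UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops(value: str) -> float:
    """Convert a throughput string such as '1.0116 KOps/s' to plain ops/s."""
    number, unit = value.split()
    return float(number) * UNIT_SCALE[unit]


def percent_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput change of the PR against repo HEAD, in percent."""
    return (to_ops(ops_pr) / to_ops(ops_head) - 1.0) * 100.0


# A few rows copied from the table above: (Ops, Ops on Repo HEAD).
rows = {
    "test_setitem": ("66.2132 KOps/s", "70.2951 KOps/s"),
    "test_distributed": ("3.8736 KOps/s", "8.6804 KOps/s"),
    "test_serialize_weights_returnearly": ("17.0194 Ops/s", "10.3987 Ops/s"),
    # Mixed units: the two columns must be normalized before dividing.
    "test_func_call_cm_runtime[True-compile-overhead]": (
        "1.0116 KOps/s",
        "981.6685 Ops/s",
    ),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {percent_change(ops_pr, ops_head):+.2f}%")
```

Since the column is a throughput ratio, a negative value means fewer operations per second on this PR than on repo HEAD. Several of the largest swings (for example `test_distributed` and `test_unbind`) also show a Max time far above the Mean, so they may reflect run-to-run noise and not necessarily an effect of the change itself.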

vmoens pushed a commit that referenced this pull request Feb 26, 2025
ghstack-source-id: f6c61cd
Pull Request resolved: #1232

(cherry picked from commit e23ce5c)