[BugFix] Faster and safer non-tensor stack #1232
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 43.8330μs | 20.5702μs | 48.6139 KOps/s | 47.9152 KOps/s | |
test_plain_set_stack_nested | 50.2340μs | 20.8808μs | 47.8908 KOps/s | 47.6908 KOps/s | |
test_plain_set_nested_inplace | 58.1190μs | 22.8923μs | 43.6829 KOps/s | 43.3637 KOps/s | |
test_plain_set_stack_nested_inplace | 49.9930μs | 22.8930μs | 43.6814 KOps/s | 43.7785 KOps/s | |
test_items | 24.1560μs | 4.1983μs | 238.1937 KOps/s | 240.4976 KOps/s | |
test_items_nested | 0.6107ms | 0.4112ms | 2.4320 KOps/s | 2.4507 KOps/s | |
test_items_nested_locked | 0.8519ms | 0.4101ms | 2.4384 KOps/s | 2.4449 KOps/s | |
test_items_nested_leaf | 0.1327ms | 77.2905μs | 12.9382 KOps/s | 13.0288 KOps/s | |
test_items_stack_nested | 0.5859ms | 0.4125ms | 2.4244 KOps/s | 2.4066 KOps/s | |
test_items_stack_nested_leaf | 0.1497ms | 77.6208μs | 12.8831 KOps/s | 12.8733 KOps/s | |
test_items_stack_nested_locked | 0.6894ms | 0.4104ms | 2.4368 KOps/s | 2.4494 KOps/s | |
test_keys | 38.7630μs | 3.4557μs | 289.3746 KOps/s | 289.6163 KOps/s | |
test_keys_nested | 0.2692ms | 0.1666ms | 6.0025 KOps/s | 6.0645 KOps/s | |
test_keys_nested_locked | 1.8315ms | 0.1748ms | 5.7210 KOps/s | 5.8503 KOps/s | |
test_keys_nested_leaf | 0.2471ms | 0.1462ms | 6.8392 KOps/s | 6.9287 KOps/s | |
test_keys_stack_nested | 0.2707ms | 0.1668ms | 5.9962 KOps/s | 5.9786 KOps/s | |
test_keys_stack_nested_leaf | 0.2605ms | 0.1462ms | 6.8415 KOps/s | 6.9416 KOps/s | |
test_keys_stack_nested_locked | 0.2741ms | 0.1725ms | 5.7983 KOps/s | 5.8178 KOps/s | |
test_values | 5.1976μs | 1.0562μs | 946.7937 KOps/s | 957.7968 KOps/s | |
test_values_nested | 0.1129ms | 63.2184μs | 15.8182 KOps/s | 15.9557 KOps/s | |
test_values_nested_locked | 0.1521ms | 63.1989μs | 15.8231 KOps/s | 15.8933 KOps/s | |
test_values_nested_leaf | 0.1232ms | 72.0065μs | 13.8876 KOps/s | 13.8911 KOps/s | |
test_values_stack_nested | 0.1311ms | 63.7311μs | 15.6909 KOps/s | 15.8585 KOps/s | |
test_values_stack_nested_leaf | 0.1281ms | 72.2092μs | 13.8486 KOps/s | 13.5612 KOps/s | |
test_values_stack_nested_locked | 0.1172ms | 63.3473μs | 15.7860 KOps/s | 16.0324 KOps/s | |
test_membership | 16.4800μs | 0.8665μs | 1.1541 MOps/s | 1.1591 MOps/s | |
test_membership_nested | 31.6290μs | 2.8685μs | 348.6094 KOps/s | 351.1008 KOps/s | |
test_membership_nested_leaf | 23.3630μs | 2.9100μs | 343.6367 KOps/s | 346.6222 KOps/s | |
test_membership_stacked_nested | 16.4710μs | 2.8894μs | 346.0904 KOps/s | 349.5656 KOps/s | |
test_membership_stacked_nested_leaf | 48.6010μs | 2.8539μs | 350.4014 KOps/s | 349.7582 KOps/s | |
test_membership_nested_last | 31.0480μs | 4.3617μs | 229.2698 KOps/s | 227.3615 KOps/s | |
test_membership_nested_leaf_last | 27.1910μs | 4.3913μs | 227.7230 KOps/s | 231.9385 KOps/s | |
test_membership_stacked_nested_last | 28.5440μs | 4.4368μs | 225.3888 KOps/s | 229.1118 KOps/s | |
test_membership_stacked_nested_leaf_last | 27.1410μs | 4.3378μs | 230.5308 KOps/s | 231.3882 KOps/s | |
test_nested_getleaf | 54.0190μs | 10.6318μs | 94.0575 KOps/s | 92.9330 KOps/s | |
test_nested_get | 46.0570μs | 10.0639μs | 99.3655 KOps/s | 97.3968 KOps/s | |
test_stacked_getleaf | 41.9890μs | 10.6077μs | 94.2711 KOps/s | 93.4146 KOps/s | |
test_stacked_get | 38.7430μs | 10.0135μs | 99.8654 KOps/s | 97.4105 KOps/s | |
test_nested_getitemleaf | 34.3440μs | 11.4377μs | 87.4301 KOps/s | 87.9445 KOps/s | |
test_nested_getitem | 41.1170μs | 10.7690μs | 92.8587 KOps/s | 92.3582 KOps/s | |
test_stacked_getitemleaf | 41.7380μs | 11.2172μs | 89.1487 KOps/s | 89.3054 KOps/s | |
test_stacked_getitem | 43.6320μs | 10.7809μs | 92.7562 KOps/s | 93.8362 KOps/s | |
test_lock_nested | 0.7466ms | 0.4232ms | 2.3627 KOps/s | 2.4583 KOps/s | |
test_lock_stack_nested | 0.6742ms | 0.4328ms | 2.3105 KOps/s | 2.3568 KOps/s | |
test_unlock_nested | 0.5023ms | 0.3495ms | 2.8613 KOps/s | 3.0199 KOps/s | |
test_unlock_stack_nested | 0.6365ms | 0.3519ms | 2.8420 KOps/s | 2.9324 KOps/s | |
test_flatten_speed | 0.1770ms | 0.1011ms | 9.8867 KOps/s | 9.9926 KOps/s | |
test_unflatten_speed | 0.7175ms | 0.5284ms | 1.8924 KOps/s | 1.9013 KOps/s | |
test_common_ops | 1.3362ms | 0.8154ms | 1.2264 KOps/s | 1.2075 KOps/s | |
test_creation | 29.7860μs | 2.4584μs | 406.7655 KOps/s | 403.5105 KOps/s | |
test_creation_empty | 44.4730μs | 12.1964μs | 81.9914 KOps/s | 82.7809 KOps/s | |
test_creation_nested_1 | 53.6710μs | 15.2923μs | 65.3922 KOps/s | 67.1445 KOps/s | |
test_creation_nested_2 | 46.5080μs | 19.9933μs | 50.0169 KOps/s | 51.4949 KOps/s | |
test_clone | 54.6120μs | 13.7608μs | 72.6704 KOps/s | 72.3065 KOps/s | |
test_getitem[int] | 0.7490ms | 12.9701μs | 77.1003 KOps/s | 78.5518 KOps/s | |
test_getitem[slice_int] | 0.1309ms | 25.0781μs | 39.8755 KOps/s | 40.8652 KOps/s | |
test_getitem[range] | 0.1741ms | 49.2660μs | 20.2980 KOps/s | 19.7576 KOps/s | |
test_getitem[tuple] | 0.1242ms | 20.4503μs | 48.8991 KOps/s | 49.1822 KOps/s | |
test_getitem[list] | 0.1679ms | 45.5225μs | 21.9672 KOps/s | 22.1838 KOps/s | |
test_setitem_dim[int] | 60.4230μs | 26.3104μs | 38.0078 KOps/s | 37.8874 KOps/s | |
test_setitem_dim[slice_int] | 83.7370μs | 52.4830μs | 19.0538 KOps/s | 19.5120 KOps/s | |
test_setitem_dim[range] | 0.1925ms | 77.9214μs | 12.8334 KOps/s | 12.9758 KOps/s | |
test_setitem_dim[tuple] | 76.3540μs | 41.8956μs | 23.8688 KOps/s | 24.3641 KOps/s | |
test_setitem | 57.1270μs | 21.0313μs | 47.5481 KOps/s | 47.6606 KOps/s | |
test_set | 0.1020ms | 20.6976μs | 48.3149 KOps/s | 48.9220 KOps/s | |
test_set_shared | 3.8428ms | 0.1880ms | 5.3183 KOps/s | 5.3966 KOps/s | |
test_update | 0.1185ms | 23.4291μs | 42.6820 KOps/s | 42.0942 KOps/s | |
test_update_nested | 95.1590μs | 34.3418μs | 29.1190 KOps/s | 28.4994 KOps/s | |
test_update__nested | 0.4555ms | 34.3681μs | 29.0968 KOps/s | 29.1196 KOps/s | |
test_set_nested | 0.1100ms | 22.8921μs | 43.6833 KOps/s | 44.1554 KOps/s | |
test_set_nested_new | 66.1040μs | 27.9520μs | 35.7756 KOps/s | 36.4505 KOps/s | |
test_select | 0.1004ms | 44.4263μs | 22.5092 KOps/s | 22.8852 KOps/s | |
test_select_nested | 0.1240ms | 65.3240μs | 15.3083 KOps/s | 15.5600 KOps/s | |
test_exclude_nested | 0.1701ms | 84.6253μs | 11.8168 KOps/s | 11.9142 KOps/s | |
test_empty[True] | 0.5549ms | 0.4151ms | 2.4091 KOps/s | 2.4281 KOps/s | |
test_empty[False] | 6.8878μs | 1.3733μs | 728.1843 KOps/s | 715.8156 KOps/s | |
test_unbind_speed | 0.3950ms | 0.2736ms | 3.6551 KOps/s | 3.6674 KOps/s | |
test_unbind_speed_stack0 | 0.5488ms | 0.2718ms | 3.6786 KOps/s | 3.7670 KOps/s | |
test_unbind_speed_stack1 | 0.1123s | 0.7411ms | 1.3493 KOps/s | 1.2389 KOps/s | |
test_split | 98.5888ms | 1.7952ms | 557.0298 Ops/s | 625.4581 Ops/s | |
test_chunk | 0.1130s | 1.8343ms | 545.1592 Ops/s | 512.4437 Ops/s | |
test_consolidate_njt[False-None] | 9.1186ms | 8.3377ms | 119.9372 Ops/s | 120.0712 Ops/s | |
test_creation[device0] | 0.2763ms | 92.8319μs | 10.7722 KOps/s | 10.6127 KOps/s | |
test_creation_from_tensor | 4.0094ms | 96.4121μs | 10.3721 KOps/s | 10.4493 KOps/s | |
test_add_one[memmap_tensor0] | 0.1275ms | 5.2010μs | 192.2722 KOps/s | 194.4549 KOps/s | |
test_contiguous[memmap_tensor0] | 10.7400μs | 0.5009μs | 1.9965 MOps/s | 1.9357 MOps/s | |
test_stack[memmap_tensor0] | 27.4720μs | 3.4348μs | 291.1359 KOps/s | 293.3070 KOps/s | |
test_memmaptd_index | 1.2941ms | 0.2310ms | 4.3286 KOps/s | 4.3878 KOps/s | |
test_memmaptd_index_astensor | 0.6679ms | 0.3183ms | 3.1420 KOps/s | 3.1843 KOps/s | |
test_memmaptd_index_op | 0.9609ms | 0.6206ms | 1.6114 KOps/s | 1.6811 KOps/s | |
test_serialize_model | 0.1240s | 0.1159s | 8.6291 Ops/s | 8.6476 Ops/s | |
test_serialize_model_pickle | 0.4685s | 0.3906s | 2.5605 Ops/s | 2.5089 Ops/s | |
test_serialize_weights | 0.1254s | 0.1156s | 8.6505 Ops/s | 7.8020 Ops/s | |
test_serialize_weights_returnearly | 0.1850s | 0.1636s | 6.1137 Ops/s | 6.2801 Ops/s | |
test_serialize_weights_pickle | 1.1616s | 0.7030s | 1.4225 Ops/s | 2.5201 Ops/s | |
test_serialize_weights_filesystem | 0.1451s | 0.1403s | 7.1268 Ops/s | 7.0511 Ops/s | |
test_serialize_model_filesystem | 0.2399s | 0.1541s | 6.4874 Ops/s | 6.4922 Ops/s | |
test_reshape_pytree | 60.7240μs | 26.3781μs | 37.9102 KOps/s | 38.0469 KOps/s | |
test_reshape_td | 88.1750μs | 32.7154μs | 30.5666 KOps/s | 30.1058 KOps/s | |
test_view_pytree | 59.9730μs | 26.2134μs | 38.1484 KOps/s | 38.1191 KOps/s | |
test_view_td | 98.4550μs | 42.1176μs | 23.7431 KOps/s | 24.1750 KOps/s | |
test_unbind_pytree | 75.9530μs | 29.4339μs | 33.9745 KOps/s | 33.6604 KOps/s | |
test_unbind_td | 0.3495ms | 40.0170μs | 24.9894 KOps/s | 25.0197 KOps/s | |
test_split_pytree | 0.1032ms | 29.1306μs | 34.3282 KOps/s | 34.0619 KOps/s | |
test_split_td | 0.6434ms | 46.2203μs | 21.6355 KOps/s | 22.0553 KOps/s | |
test_add_pytree | 70.2920μs | 36.7763μs | 27.1914 KOps/s | 27.5686 KOps/s | |
test_add_td | 0.1386ms | 61.5443μs | 16.2485 KOps/s | 17.8282 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1606ms | 66.3681μs | 15.0675 KOps/s | 15.1469 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 1.4006ms | 0.1760ms | 5.6822 KOps/s | 5.7711 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1032ms | 45.2975μs | 22.0763 KOps/s | 21.9642 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2332ms | 0.1203ms | 8.3097 KOps/s | 8.3399 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 85.3900μs | 26.8241μs | 37.2799 KOps/s | 36.0885 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1221ms | 59.1333μs | 16.9110 KOps/s | 17.1573 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1680ms | 79.3579μs | 12.6011 KOps/s | 12.6255 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1246ms | 67.4784μs | 14.8196 KOps/s | 14.8697 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2194ms | 0.1058ms | 9.4526 KOps/s | 9.3547 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4989ms | 0.2152ms | 4.6470 KOps/s | 4.5833 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1290ms | 47.8414μs | 20.9024 KOps/s | 21.6547 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1919ms | 67.5368μs | 14.8067 KOps/s | 14.6407 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2354ms | 99.9211μs | 10.0079 KOps/s | 9.9814 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4720ms | 0.2020ms | 4.9514 KOps/s | 4.9585 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3902ms | 0.2312ms | 4.3253 KOps/s | 4.3144 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2367ms | 0.1087ms | 9.2029 KOps/s | 9.3005 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.3005ms | 63.4209μs | 15.7677 KOps/s | 15.9229 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.5247ms | 49.3746μs | 20.2533 KOps/s | 20.7969 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2591ms | 0.1566ms | 6.3852 KOps/s | 6.3147 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2212ms | 0.1017ms | 9.8286 KOps/s | 9.9383 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 82.2440μs | 20.6823μs | 48.3505 KOps/s | 46.6997 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1685ms | 67.1705μs | 14.8875 KOps/s | 14.8538 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1658ms | 80.7140μs | 12.3894 KOps/s | 12.2672 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1431ms | 66.8965μs | 14.9485 KOps/s | 14.9086 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.5141ms | 0.2183ms | 4.5813 KOps/s | 4.5560 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.8633ms | 1.3806ms | 724.3075 Ops/s | 730.5316 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4737ms | 0.2127ms | 4.7004 KOps/s | 4.7065 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.4462ms | 0.8125ms | 1.2308 KOps/s | 1.2172 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5906ms | 0.4590ms | 2.1787 KOps/s | 2.1888 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.5659ms | 2.7483ms | 363.8565 Ops/s | 365.0800 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 84.4680μs | 39.4271μs | 25.3632 KOps/s | 25.7811 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.8643ms | 33.7203μs | 29.6557 KOps/s | 29.5305 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1511ms | 31.9558μs | 31.2932 KOps/s | 32.6025 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 70.1620μs | 23.3136μs | 42.8933 KOps/s | 43.2053 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1159ms | 32.7179μs | 30.5643 KOps/s | 31.2147 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 79.2590μs | 23.1074μs | 43.2762 KOps/s | 42.5863 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1351ms | 54.0414μs | 18.5043 KOps/s | 19.1993 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4215ms | 20.2636μs | 49.3495 KOps/s | 49.0891 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 98.4540μs | 46.1990μs | 21.6455 KOps/s | 22.0180 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 68.7880μs | 18.8399μs | 53.0788 KOps/s | 52.3173 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 98.8760μs | 46.7732μs | 21.3798 KOps/s | 21.5376 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 80.0000μs | 20.1398μs | 49.6529 KOps/s | 52.2057 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1203ms | 55.4644μs | 18.0296 KOps/s | 18.5067 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9967ms | 20.4440μs | 48.9142 KOps/s | 49.9302 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 91.9120μs | 46.8485μs | 21.3454 KOps/s | 21.4303 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 74.4790μs | 18.7925μs | 53.2126 KOps/s | 53.0054 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1564ms | 47.3064μs | 21.1388 KOps/s | 21.4985 KOps/s | |
test_compile_indexing[int-pytree-eager] | 87.3240μs | 18.7456μs | 53.3459 KOps/s | 52.8994 KOps/s | |
test_mod_add[eager] | 0.1041ms | 36.1178μs | 27.6872 KOps/s | 26.9940 KOps/s | |
test_mod_add[compile] | 0.1456ms | 66.8901μs | 14.9499 KOps/s | 15.1505 KOps/s | |
test_mod_add[compile-overhead] | 0.1285ms | 64.3673μs | 15.5358 KOps/s | 14.8705 KOps/s | |
test_mod_wrap[eager] | 0.3863ms | 0.2304ms | 4.3408 KOps/s | 4.3209 KOps/s | |
test_mod_wrap[compile] | 2.3844ms | 0.2333ms | 4.2868 KOps/s | 4.2893 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3990ms | 0.2263ms | 4.4196 KOps/s | 4.1809 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.2430ms | 11.4583ms | 87.2729 Ops/s | 77.4295 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.8795ms | 11.4385ms | 87.4239 Ops/s | 88.1442 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.2105ms | 11.0527ms | 90.4758 Ops/s | 78.4703 Ops/s | |
test_seq_add[eager] | 0.2074ms | 0.1216ms | 8.2228 KOps/s | 8.0949 KOps/s | |
test_seq_add[compile] | 0.1639ms | 76.9513μs | 12.9952 KOps/s | 12.9020 KOps/s | |
test_seq_add[compile-overhead] | 0.1555ms | 76.5971μs | 13.0553 KOps/s | 13.2877 KOps/s | |
test_seq_wrap[eager] | 0.7388ms | 0.4629ms | 2.1602 KOps/s | 2.1782 KOps/s | |
test_seq_wrap[compile] | 0.3775ms | 0.2441ms | 4.0971 KOps/s | 4.0954 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3443ms | 0.2413ms | 4.1444 KOps/s | 4.1386 KOps/s | |
test_func_call_runtime[False-eager] | 0.7269ms | 0.5522ms | 1.8108 KOps/s | 1.8083 KOps/s | |
test_func_call_runtime[False-compile] | 0.7085ms | 0.4440ms | 2.2521 KOps/s | 2.2540 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5461ms | 0.4433ms | 2.2557 KOps/s | 2.2672 KOps/s | |
test_func_call_runtime[True-eager] | 1.2345ms | 0.7701ms | 1.2986 KOps/s | 1.2980 KOps/s | |
test_func_call_runtime[True-compile] | 0.6354ms | 0.4635ms | 2.1576 KOps/s | 2.1282 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5515ms | 0.4686ms | 2.1340 KOps/s | 2.1600 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.0773ms | 0.5489ms | 1.8219 KOps/s | 1.8067 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5602ms | 0.4464ms | 2.2401 KOps/s | 2.2579 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7556ms | 0.4510ms | 2.2174 KOps/s | 2.2620 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1196ms | 0.9225ms | 1.0840 KOps/s | 1.0837 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1596ms | 0.8167ms | 1.2245 KOps/s | 1.2344 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.2303ms | 0.8246ms | 1.2127 KOps/s | 1.2286 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 3.3584ms | 1.9284ms | 518.5612 Ops/s | 523.9568 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9650ms | 0.5361ms | 1.8652 KOps/s | 1.8491 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.6528ms | 0.5326ms | 1.8776 KOps/s | 1.8630 KOps/s | |
test_distributed | 0.3142ms | 0.1241ms | 8.0590 KOps/s | 7.7874 KOps/s | |
test_tdmodule | 90.4790μs | 27.6661μs | 36.1453 KOps/s | 36.7329 KOps/s | |
test_tdmodule_dispatch | 80.3900μs | 50.4174μs | 19.8344 KOps/s | 19.3988 KOps/s | |
test_tdseq | 64.4100μs | 31.3448μs | 31.9032 KOps/s | 33.5396 KOps/s | |
test_tdseq_dispatch | 80.6010μs | 54.3222μs | 18.4087 KOps/s | 18.0737 KOps/s | |
test_instantiation_functorch | 1.7503ms | 1.5301ms | 653.5659 Ops/s | 662.3571 Ops/s | |
test_exec_functorch | 0.3106ms | 0.1825ms | 5.4790 KOps/s | 5.4755 KOps/s | |
test_exec_functional_call | 0.3707ms | 0.1803ms | 5.5467 KOps/s | 5.6763 KOps/s | |
test_exec_td_decorator | 0.4477ms | 0.2379ms | 4.2035 KOps/s | 4.1711 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8035ms | 0.6580ms | 1.5199 KOps/s | 1.4970 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9927ms | 0.6596ms | 1.5160 KOps/s | 1.4920 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9493ms | 0.5346ms | 1.8705 KOps/s | 1.8785 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.2954ms | 0.5434ms | 1.8402 KOps/s | 1.8711 KOps/s | |
test_to_module_speed[True] | 1.9583ms | 1.3491ms | 741.2379 Ops/s | 759.5928 Ops/s | |
test_to_module_speed[False] | 2.1719ms | 1.3182ms | 758.6313 Ops/s | 779.5956 Ops/s | |
test_tc_init | 91.2810μs | 45.6039μs | 21.9279 KOps/s | 22.5244 KOps/s | |
test_tc_init_nested | 0.1635ms | 91.4916μs | 10.9300 KOps/s | 11.0721 KOps/s | |
test_tc_first_layer_tensor | 22.8730μs | 1.5090μs | 662.6701 KOps/s | 659.0738 KOps/s | |
test_tc_first_layer_nontensor | 27.3710μs | 4.7023μs | 212.6637 KOps/s | 214.9562 KOps/s | |
test_tc_second_layer_tensor | 23.0730μs | 2.8360μs | 352.6117 KOps/s | 355.3047 KOps/s | |
test_tc_second_layer_nontensor | 45.1740μs | 6.0121μs | 166.3322 KOps/s | 167.7909 KOps/s | |
test_unbind | 0.2233s | 12.8990ms | 77.5254 Ops/s | 77.9915 Ops/s | |
test_full_like | 9.0403ms | 8.5046ms | 117.5828 Ops/s | 130.7495 Ops/s | |
test_zeros_like | 5.3129ms | 2.5881ms | 386.3798 Ops/s | 361.3585 Ops/s | |
test_ones_like | 4.4262ms | 3.0932ms | 323.2939 Ops/s | 284.3036 Ops/s | |
test_clone | 9.4661ms | 6.2015ms | 161.2519 Ops/s | 197.4588 Ops/s | |
test_squeeze | 58.5700μs | 12.7804μs | 78.2446 KOps/s | 77.8769 KOps/s | |
test_unsqueeze | 0.1645ms | 96.0532μs | 10.4109 KOps/s | 10.8008 KOps/s | |
test_split | 0.4354ms | 0.1964ms | 5.0918 KOps/s | 5.1707 KOps/s | |
test_permute | 0.2960ms | 0.2012ms | 4.9695 KOps/s | 4.9378 KOps/s | |
test_stack | 31.4263ms | 23.5216ms | 42.5142 Ops/s | 40.7310 Ops/s | |
test_cat | 31.2299ms | 23.6839ms | 42.2228 Ops/s | 41.1116 Ops/s |
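
The Change column carries no values in the table above, so the relative difference between a row's Ops and its Ops on Repo HEAD has to be worked out from the two throughput columns. A minimal sketch in Python, assuming the row layout and unit suffixes shown in the table; `parse_ops` and `relative_change` are hypothetical helper names, not part of tensordict or its benchmark tooling:

```python
# Multipliers for the throughput unit suffixes that appear in the table.
_OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '48.6139 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _OPS_UNITS[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, fractional change of Ops relative to Ops on Repo HEAD).

    Assumes the column order Name | Max | Mean | Ops | Ops on Repo HEAD | Change.
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])
    ops_head = parse_ops(cells[4])
    return name, (ops - ops_head) / ops_head


# One row copied from the table above.
row = "test_serialize_weights_pickle | 1.1616s | 0.7030s | 1.4225 Ops/s | 2.5201 Ops/s | |"
name, change = relative_change(row)
print(f"{name}: {change:+.1%}")
```

A negative value means the PR branch ran fewer operations per second than repo HEAD for that test. Single-run differences of a few percent in these tables are likely within noise, so this is best used to spot large outliers.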
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1497ms | 12.9662μs | 77.1234 KOps/s | 80.3384 KOps/s | |
test_plain_set_stack_nested | 43.9910μs | 13.0501μs | 76.6280 KOps/s | 79.2891 KOps/s | |
test_plain_set_nested_inplace | 84.1210μs | 14.0473μs | 71.1882 KOps/s | 73.7594 KOps/s | |
test_plain_set_stack_nested_inplace | 42.9300μs | 13.8740μs | 72.0773 KOps/s | 74.5861 KOps/s | |
test_items | 41.3110μs | 2.8471μs | 351.2320 KOps/s | 349.7694 KOps/s | |
test_items_nested | 0.5381ms | 0.3644ms | 2.7439 KOps/s | 2.7881 KOps/s | |
test_items_nested_locked | 0.4057ms | 0.3701ms | 2.7019 KOps/s | 2.7720 KOps/s | |
test_items_nested_leaf | 0.1365ms | 60.7027μs | 16.4737 KOps/s | 16.6370 KOps/s | |
test_items_stack_nested | 0.4143ms | 0.3643ms | 2.7449 KOps/s | 2.7795 KOps/s | |
test_items_stack_nested_leaf | 0.1051ms | 60.5414μs | 16.5176 KOps/s | 16.6384 KOps/s | |
test_items_stack_nested_locked | 0.4445ms | 0.3648ms | 2.7409 KOps/s | 2.7953 KOps/s | |
test_keys | 33.0000μs | 3.4286μs | 291.6604 KOps/s | 293.8900 KOps/s | |
test_keys_nested | 0.1493ms | 87.2589μs | 11.4601 KOps/s | 11.3864 KOps/s | |
test_keys_nested_locked | 0.7861ms | 93.3730μs | 10.7097 KOps/s | 10.7749 KOps/s | |
test_keys_nested_leaf | 0.1148ms | 78.7519μs | 12.6981 KOps/s | 12.7842 KOps/s | |
test_keys_stack_nested | 0.1301ms | 87.5512μs | 11.4219 KOps/s | 11.5090 KOps/s | |
test_keys_stack_nested_leaf | 0.1190ms | 78.6285μs | 12.7180 KOps/s | 12.8160 KOps/s | |
test_keys_stack_nested_locked | 0.1419ms | 93.4313μs | 10.7030 KOps/s | 10.7982 KOps/s | |
test_values | 4.7333μs | 0.8508μs | 1.1754 MOps/s | 1.1699 MOps/s | |
test_values_nested | 66.2400μs | 37.0609μs | 26.9826 KOps/s | 27.2198 KOps/s | |
test_values_nested_locked | 67.2410μs | 38.9211μs | 25.6930 KOps/s | 25.7892 KOps/s | |
test_values_nested_leaf | 93.0910μs | 42.5401μs | 23.5072 KOps/s | 23.9832 KOps/s | |
test_values_stack_nested | 74.1710μs | 37.1366μs | 26.9276 KOps/s | 27.0046 KOps/s | |
test_values_stack_nested_leaf | 75.0810μs | 42.3829μs | 23.5944 KOps/s | 23.7289 KOps/s | |
test_values_stack_nested_locked | 76.5110μs | 39.3227μs | 25.4306 KOps/s | 25.7223 KOps/s | |
test_membership | 1.6900μs | 0.5003μs | 1.9987 MOps/s | 1.9924 MOps/s | |
test_membership_nested | 30.0905μs | 1.9648μs | 508.9628 KOps/s | 488.5372 KOps/s | |
test_membership_nested_leaf | 27.9350μs | 1.9789μs | 505.3409 KOps/s | 507.2284 KOps/s | |
test_membership_stacked_nested | 35.5800μs | 2.0774μs | 481.3819 KOps/s | 489.8336 KOps/s | |
test_membership_stacked_nested_leaf | 23.7400μs | 2.0560μs | 486.3792 KOps/s | 492.0336 KOps/s | |
test_membership_nested_last | 75.3610μs | 2.9970μs | 333.6628 KOps/s | 327.0534 KOps/s | |
test_membership_nested_leaf_last | 26.4500μs | 3.0321μs | 329.8061 KOps/s | 328.3819 KOps/s | |
test_membership_stacked_nested_last | 46.9900μs | 3.0129μs | 331.9027 KOps/s | 329.8205 KOps/s | |
test_membership_stacked_nested_leaf_last | 39.3310μs | 3.0132μs | 331.8702 KOps/s | 335.9681 KOps/s | |
test_nested_getleaf | 76.8310μs | 6.1696μs | 162.0838 KOps/s | 161.1178 KOps/s | |
test_nested_get | 30.5510μs | 5.8447μs | 171.0960 KOps/s | 169.2782 KOps/s | |
test_stacked_getleaf | 47.6810μs | 6.0654μs | 164.8700 KOps/s | 163.1100 KOps/s | |
test_stacked_get | 28.1010μs | 5.7306μs | 174.5013 KOps/s | 175.7378 KOps/s | |
test_nested_getitemleaf | 45.2210μs | 6.3663μs | 157.0766 KOps/s | 157.4891 KOps/s | |
test_nested_getitem | 32.8110μs | 6.0408μs | 165.5396 KOps/s | 165.3038 KOps/s | |
test_stacked_getitemleaf | 45.4010μs | 6.3042μs | 158.6254 KOps/s | 158.7933 KOps/s | |
test_stacked_getitem | 0.3682ms | 5.9361μs | 168.4607 KOps/s | 168.5020 KOps/s | |
test_lock_nested | 9.8100ms | 0.3448ms | 2.9003 KOps/s | 2.9036 KOps/s | |
test_lock_stack_nested | 0.4774ms | 0.3446ms | 2.9019 KOps/s | 2.9219 KOps/s | |
test_unlock_nested | 0.4278ms | 0.2801ms | 3.5698 KOps/s | 3.5470 KOps/s | |
test_unlock_stack_nested | 0.3389ms | 0.2811ms | 3.5578 KOps/s | 3.5374 KOps/s | |
test_flatten_speed | 0.1159ms | 78.8309μs | 12.6854 KOps/s | 13.0950 KOps/s | |
test_unflatten_speed | 0.3876ms | 0.3211ms | 3.1142 KOps/s | 3.1425 KOps/s | |
test_common_ops | 0.8325ms | 0.6199ms | 1.6132 KOps/s | 1.6721 KOps/s | |
test_creation | 0.1210ms | 1.7132μs | 583.6934 KOps/s | 586.0269 KOps/s | |
test_creation_empty | 47.0600μs | 9.3540μs | 106.9057 KOps/s | 120.1582 KOps/s | |
test_creation_nested_1 | 44.7710μs | 11.0055μs | 90.8635 KOps/s | 99.8041 KOps/s | |
test_creation_nested_2 | 41.6110μs | 13.7491μs | 72.7319 KOps/s | 79.4605 KOps/s | |
test_clone | 54.5510μs | 10.0742μs | 99.2632 KOps/s | 98.4947 KOps/s | |
test_getitem[int] | 1.5564ms | 10.9242μs | 91.5402 KOps/s | 95.2119 KOps/s | |
test_getitem[slice_int] | 0.1605ms | 20.8111μs | 48.0512 KOps/s | 48.7478 KOps/s | |
test_getitem[range] | 0.2359ms | 37.5799μs | 26.6100 KOps/s | 28.1728 KOps/s | |
test_getitem[tuple] | 0.1134ms | 18.2743μs | 54.7216 KOps/s | 55.6594 KOps/s | |
test_getitem[list] | 0.1311ms | 32.4310μs | 30.8347 KOps/s | 31.8889 KOps/s | |
test_setitem_dim[int] | 37.5210μs | 18.5441μs | 53.9256 KOps/s | 52.8625 KOps/s | |
test_setitem_dim[slice_int] | 67.4810μs | 37.8243μs | 26.4380 KOps/s | 27.2590 KOps/s | |
test_setitem_dim[range] | 74.0510μs | 51.5480μs | 19.3994 KOps/s | 19.6695 KOps/s | |
test_setitem_dim[tuple] | 52.1400μs | 31.6235μs | 31.6220 KOps/s | 32.1856 KOps/s | |
test_setitem | 60.2610μs | 15.1027μs | 66.2132 KOps/s | 70.2951 KOps/s | |
test_set | 0.2036ms | 14.5439μs | 68.7576 KOps/s | 72.4803 KOps/s | |
test_set_shared | 0.5918ms | 0.1564ms | 6.3938 KOps/s | 6.4774 KOps/s | |
test_update | 0.4447ms | 18.2179μs | 54.8911 KOps/s | 58.9333 KOps/s | |
test_update_nested | 66.4410μs | 23.5221μs | 42.5132 KOps/s | 44.2692 KOps/s | |
test_update__nested | 0.5390ms | 24.6247μs | 40.6097 KOps/s | 42.5987 KOps/s | |
test_set_nested | 0.1011ms | 15.7873μs | 63.3422 KOps/s | 64.7483 KOps/s | |
test_set_nested_new | 0.1045ms | 17.9405μs | 55.7397 KOps/s | 56.4229 KOps/s | |
test_select | 70.5810μs | 30.0835μs | 33.2408 KOps/s | 35.0985 KOps/s | |
test_select_nested | 79.7510μs | 42.9415μs | 23.2875 KOps/s | 23.2953 KOps/s | |
test_exclude_nested | 98.8510μs | 62.1863μs | 16.0807 KOps/s | 16.1512 KOps/s | |
test_empty[True] | 0.4023ms | 0.2919ms | 3.4256 KOps/s | 3.4557 KOps/s | |
test_empty[False] | 4.4311μs | 0.8125μs | 1.2308 MOps/s | 1.2366 MOps/s | |
test_to | 0.1175ms | 60.1779μs | 16.6174 KOps/s | 17.9979 KOps/s | |
test_to_nonblocking | 0.1094ms | 45.4820μs | 21.9867 KOps/s | 22.1692 KOps/s | |
test_unbind_speed | 0.3074ms | 0.2408ms | 4.1536 KOps/s | 4.1803 KOps/s | |
test_unbind_speed_stack0 | 0.3179ms | 0.2350ms | 4.2549 KOps/s | 4.2821 KOps/s | |
test_unbind_speed_stack1 | 0.1123s | 0.7481ms | 1.3367 KOps/s | 1.3309 KOps/s | |
test_split | 0.1043s | 1.6213ms | 616.8056 Ops/s | 621.5910 Ops/s | |
test_chunk | 0.1151s | 1.6377ms | 610.6076 Ops/s | 622.4754 Ops/s | |
test_consolidate[False-None] | 3.1330ms | 2.7601ms | 362.3096 Ops/s | 332.3167 Ops/s | |
test_consolidate[default-None] | 1.8989ms | 1.7452ms | 573.0103 Ops/s | 590.8742 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.9029ms | 1.7549ms | 569.8191 Ops/s | 579.2034 Ops/s | |
test_consolidate_njt[False-None] | 6.8521ms | 6.5568ms | 152.5126 Ops/s | 154.9531 Ops/s | |
test_to[False-False-None] | 1.8798ms | 1.7002ms | 588.1541 Ops/s | 599.0993 Ops/s | |
test_to[True-False-None] | 1.5357ms | 1.3423ms | 744.9772 Ops/s | 755.4403 Ops/s | |
test_to[within-False-None] | 4.4795ms | 4.2157ms | 237.2110 Ops/s | 241.1473 Ops/s | |
test_to[True-default-None] | 5.8201ms | 5.2689ms | 189.7942 Ops/s | 195.0482 Ops/s | |
test_to_njt[False-False-None] | 7.3226ms | 6.9302ms | 144.2953 Ops/s | 147.7863 Ops/s | |
test_to_njt[True-False-None] | 5.9120ms | 5.5844ms | 179.0708 Ops/s | 186.3449 Ops/s | |
test_to_njt[within-False-None] | 12.5549ms | 12.1121ms | 82.5624 Ops/s | 84.3894 Ops/s | |
test_creation[device0] | 0.6560ms | 82.2385μs | 12.1598 KOps/s | 12.7500 KOps/s | |
test_creation_from_tensor | 0.6442ms | 87.4978μs | 11.4289 KOps/s | 12.2424 KOps/s | |
test_add_one[memmap_tensor0] | 0.4607ms | 6.5367μs | 152.9818 KOps/s | 158.9742 KOps/s | |
test_contiguous[memmap_tensor0] | 1.8850μs | 0.4243μs | 2.3566 MOps/s | 2.3758 MOps/s | |
test_stack[memmap_tensor0] | 0.1200ms | 4.5723μs | 218.7065 KOps/s | 216.7132 KOps/s | |
test_memmaptd_index | 1.7278ms | 0.2413ms | 4.1437 KOps/s | 4.1869 KOps/s | |
test_memmaptd_index_astensor | 0.4355ms | 0.3060ms | 3.2682 KOps/s | 3.3110 KOps/s | |
test_memmaptd_index_op | 0.7241ms | 0.5876ms | 1.7018 KOps/s | 1.8076 KOps/s | |
test_serialize_model | 0.1331s | 0.1316s | 7.6015 Ops/s | 7.6146 Ops/s | |
test_serialize_model_pickle | 1.6895s | 1.3236s | 0.7555 Ops/s | 0.8247 Ops/s | |
test_serialize_weights | 0.1325s | 0.1305s | 7.6652 Ops/s | 7.6237 Ops/s | |
test_serialize_weights_returnearly | 0.3937s | 58.7566ms | 17.0194 Ops/s | 10.3987 Ops/s | |
test_serialize_weights_pickle | 1.7219s | 1.3408s | 0.7458 Ops/s | 0.8222 Ops/s | |
test_reshape_pytree | 53.2710μs | 22.6195μs | 44.2096 KOps/s | 45.8115 KOps/s | |
test_reshape_td | 60.0110μs | 26.9519μs | 37.1031 KOps/s | 36.8953 KOps/s | |
test_view_pytree | 51.0100μs | 22.4178μs | 44.6075 KOps/s | 45.7788 KOps/s | |
test_view_td | 85.8210μs | 31.5829μs | 31.6627 KOps/s | 30.0473 KOps/s | |
test_unbind_pytree | 88.4110μs | 28.2425μs | 35.4076 KOps/s | 35.7386 KOps/s | |
test_unbind_td | 0.7335ms | 36.1013μs | 27.6998 KOps/s | 27.0454 KOps/s | |
test_split_pytree | 0.1651ms | 30.1994μs | 33.1132 KOps/s | 33.7320 KOps/s | |
test_split_td | 0.8421ms | 39.8100μs | 25.1193 KOps/s | 25.5686 KOps/s | |
test_add_pytree | 0.1201ms | 33.6503μs | 29.7174 KOps/s | 30.3084 KOps/s | |
test_add_td | 0.1836ms | 51.8403μs | 19.2900 KOps/s | 21.9975 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1919ms | 0.1210ms | 8.2637 KOps/s | 8.0277 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2825ms | 0.1335ms | 7.4893 KOps/s | 7.4759 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1432ms | 94.8856μs | 10.5390 KOps/s | 10.2608 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.4928ms | 0.1446ms | 6.9134 KOps/s | 6.8500 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1441ms | 24.1002μs | 41.4934 KOps/s | 43.5114 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1595ms | 29.6398μs | 33.7385 KOps/s | 34.9812 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3767ms | 63.6600μs | 15.7085 KOps/s | 15.4352 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1870ms | 48.9061μs | 20.4473 KOps/s | 20.2693 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1873ms | 0.1413ms | 7.0780 KOps/s | 7.0865 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3470ms | 0.2181ms | 4.5840 KOps/s | 4.6606 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2078ms | 96.3937μs | 10.3741 KOps/s | 10.3709 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.3067ms | 55.2034μs | 18.1148 KOps/s | 18.0487 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2700ms | 0.1353ms | 7.3933 KOps/s | 7.3652 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6303ms | 0.4730ms | 2.1142 KOps/s | 2.1297 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4016ms | 0.2620ms | 3.8165 KOps/s | 3.8309 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2683ms | 0.1423ms | 7.0294 KOps/s | 7.1829 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2182ms | 66.4795μs | 15.0422 KOps/s | 14.6427 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2297ms | 0.1021ms | 9.7975 KOps/s | 10.3394 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5591ms | 0.3932ms | 2.5431 KOps/s | 2.5366 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2809ms | 0.1346ms | 7.4285 KOps/s | 7.4739 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 60.3410μs | 18.9824μs | 52.6804 KOps/s | 56.1928 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.2614ms | 31.6702μs | 31.5754 KOps/s | 32.1791 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2290ms | 69.6800μs | 14.3513 KOps/s | 14.3424 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1011ms | 51.9476μs | 19.2502 KOps/s | 19.1567 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6589ms | 0.4523ms | 2.2108 KOps/s | 2.2039 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.7870ms | 2.5616ms | 390.3783 Ops/s | 394.9628 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.8012ms | 0.4732ms | 2.1131 KOps/s | 2.1689 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8094ms | 2.5388ms | 393.8877 Ops/s | 396.7864 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2994ms | 0.1160ms | 8.6198 KOps/s | 9.0531 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5881ms | 79.2719μs | 12.6148 KOps/s | 12.7535 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2751ms | 0.1093ms | 9.1467 KOps/s | 9.4830 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2490ms | 66.1270μs | 15.1224 KOps/s | 14.9495 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2729ms | 0.1092ms | 9.1599 KOps/s | 9.5870 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2387ms | 67.1792μs | 14.8856 KOps/s | 15.2900 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1646ms | 0.1023ms | 9.7766 KOps/s | 9.9521 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1545ms | 17.5183μs | 57.0831 KOps/s | 59.7687 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2512ms | 97.9367μs | 10.2107 KOps/s | 10.3395 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1505ms | 15.9954μs | 62.5178 KOps/s | 63.9040 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2468ms | 99.4508μs | 10.0552 KOps/s | 10.3848 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1537ms | 15.9475μs | 62.7058 KOps/s | 64.6558 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2547ms | 0.1028ms | 9.7238 KOps/s | 9.9630 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8064ms | 17.1790μs | 58.2105 KOps/s | 59.2270 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2759ms | 98.7651μs | 10.1250 KOps/s | 10.4060 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1911ms | 15.8209μs | 63.2074 KOps/s | 64.3994 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2478ms | 97.5479μs | 10.2514 KOps/s | 10.2425 KOps/s | |
test_compile_indexing[int-pytree-eager] | 53.9000μs | 16.5513μs | 60.4180 KOps/s | 63.0997 KOps/s | |
test_mod_add[eager] | 0.1975ms | 38.7223μs | 25.8249 KOps/s | 27.3720 KOps/s | |
test_mod_add[compile] | 0.2431ms | 83.7092μs | 11.9461 KOps/s | 12.5151 KOps/s | |
test_mod_add[compile-overhead] | 0.3421ms | 0.1701ms | 5.8804 KOps/s | 5.6330 KOps/s | |
test_mod_wrap[eager] | 0.4334ms | 0.2556ms | 3.9124 KOps/s | 4.0693 KOps/s | |
test_mod_wrap[compile] | 0.4282ms | 0.2849ms | 3.5097 KOps/s | 3.5087 KOps/s | |
test_mod_wrap[compile-overhead] | 7.3942ms | 3.9079ms | 255.8935 Ops/s | 261.9192 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5433ms | 1.3228ms | 756.0002 Ops/s | 745.2710 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5782ms | 1.2557ms | 796.3944 Ops/s | 791.4436 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4349ms | 0.9308ms | 1.0743 KOps/s | 1.0661 KOps/s | |
test_seq_add[eager] | 0.5283ms | 0.1149ms | 8.7051 KOps/s | 8.6257 KOps/s | |
test_seq_add[compile] | 0.4863ms | 90.3459μs | 11.0686 KOps/s | 11.0161 KOps/s | |
test_seq_add[compile-overhead] | 0.2796ms | 0.1286ms | 7.7768 KOps/s | 7.5874 KOps/s | |
test_seq_wrap[eager] | 0.8433ms | 0.4342ms | 2.3031 KOps/s | 2.2568 KOps/s | |
test_seq_wrap[compile] | 0.7144ms | 0.3005ms | 3.3275 KOps/s | 3.1546 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4234ms | 0.2253ms | 4.4389 KOps/s | 4.3962 KOps/s | |
test_func_call_runtime[False-eager] | 0.8905ms | 0.7237ms | 1.3818 KOps/s | 1.3118 KOps/s | |
test_func_call_runtime[False-compile] | 0.8983ms | 0.7447ms | 1.3428 KOps/s | 1.3354 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5036ms | 0.3675ms | 2.7207 KOps/s | 2.7417 KOps/s | |
test_func_call_runtime[True-eager] | 1.0807ms | 0.8981ms | 1.1135 KOps/s | 1.1348 KOps/s | |
test_func_call_runtime[True-compile] | 0.9075ms | 0.7695ms | 1.2995 KOps/s | 1.3016 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5463ms | 0.3921ms | 2.5506 KOps/s | 2.5862 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9012ms | 0.7390ms | 1.3532 KOps/s | 1.3966 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9332ms | 0.7569ms | 1.3211 KOps/s | 1.3362 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5079ms | 0.3666ms | 2.7274 KOps/s | 2.7185 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2109ms | 0.9895ms | 1.0107 KOps/s | 1.0036 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1279ms | 0.9639ms | 1.0375 KOps/s | 1.0199 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1896ms | 0.9886ms | 1.0116 KOps/s | 981.6685 Ops/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4634ms | 2.0207ms | 494.8853 Ops/s | 489.6128 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9617ms | 0.8080ms | 1.2377 KOps/s | 1.1880 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4880ms | 0.4175ms | 2.3953 KOps/s | 2.3809 KOps/s | |
test_distributed | 1.9762ms | 0.2582ms | 3.8736 KOps/s | 8.6804 KOps/s | |
test_tdmodule | 0.4332ms | 21.1979μs | 47.1745 KOps/s | 49.6230 KOps/s | |
test_tdmodule_dispatch | 65.4910μs | 35.8332μs | 27.9071 KOps/s | 27.4717 KOps/s | |
test_tdseq | 62.8410μs | 20.7996μs | 48.0778 KOps/s | 48.6259 KOps/s | |
test_tdseq_dispatch | 98.7910μs | 39.8571μs | 25.0896 KOps/s | 26.2011 KOps/s | |
test_instantiation_functorch | 1.6511ms | 1.5170ms | 659.1810 Ops/s | 655.5841 Ops/s | |
test_exec_functorch | 0.1833ms | 0.1374ms | 7.2801 KOps/s | 7.2727 KOps/s | |
test_exec_functional_call | 0.2898ms | 0.1322ms | 7.5625 KOps/s | 7.7494 KOps/s | |
test_exec_td_decorator | 0.3680ms | 0.1794ms | 5.5747 KOps/s | 5.5272 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8216ms | 0.6669ms | 1.4995 KOps/s | 1.4955 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8257ms | 0.6665ms | 1.5004 KOps/s | 1.4936 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7509ms | 0.5713ms | 1.7503 KOps/s | 1.7357 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7213ms | 0.5735ms | 1.7437 KOps/s | 1.7125 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 18.7738ms | 18.5991ms | 53.7660 Ops/s | 53.9133 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 18.8279ms | 18.6428ms | 53.6401 Ops/s | 53.8052 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.6236ms | 18.4040ms | 54.3362 Ops/s | 54.1489 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.6290ms | 18.3856ms | 54.3904 Ops/s | 54.1862 Ops/s | |
test_to_module_speed[True] | 1.0507ms | 0.9512ms | 1.0513 KOps/s | 1.0349 KOps/s | |
test_to_module_speed[False] | 1.3704ms | 0.9395ms | 1.0644 KOps/s | 1.0525 KOps/s | |
test_tc_init | 0.1355ms | 34.7732μs | 28.7578 KOps/s | 28.7412 KOps/s | |
test_tc_init_nested | 0.1835ms | 70.4558μs | 14.1933 KOps/s | 14.2180 KOps/s | |
test_tc_first_layer_tensor | 29.2900μs | 0.7899μs | 1.2659 MOps/s | 1.2472 MOps/s | |
test_tc_first_layer_nontensor | 27.8900μs | 2.2308μs | 448.2766 KOps/s | 457.9040 KOps/s | |
test_tc_second_layer_tensor | 28.6500μs | 1.4829μs | 674.3553 KOps/s | 715.5921 KOps/s | |
test_tc_second_layer_nontensor | 28.8610μs | 2.9356μs | 340.6512 KOps/s | 342.7305 KOps/s | |
test_unbind | 0.2782s | 11.0627ms | 90.3937 Ops/s | 143.5928 Ops/s | |
test_full_like | 14.1386ms | 11.0907ms | 90.1658 Ops/s | 89.8247 Ops/s | |
test_zeros_like | 6.3344ms | 5.0058ms | 199.7665 Ops/s | 210.1786 Ops/s | |
test_ones_like | 6.2493ms | 5.0144ms | 199.4246 Ops/s | 203.3744 Ops/s | |
test_clone | 14.8226ms | 10.5330ms | 94.9393 Ops/s | 128.4608 Ops/s | |
test_squeeze | 57.7200μs | 9.6064μs | 104.0978 KOps/s | 103.4161 KOps/s | |
test_unsqueeze | 0.1526ms | 71.9739μs | 13.8939 KOps/s | 13.9151 KOps/s | |
test_split | 0.3704ms | 0.1586ms | 6.3053 KOps/s | 6.3972 KOps/s | |
test_permute | 0.2525ms | 0.1836ms | 5.4476 KOps/s | 5.7572 KOps/s | |
test_stack | 55.7966ms | 53.3705ms | 18.7369 Ops/s | 18.5921 Ops/s | |
test_cat | 55.8202ms | 53.8535ms | 18.5689 Ops/s | 18.7194 Ops/s |
Labels: CLA Signed
Stack from ghstack (oldest at bottom):