8000 Fix ACT temporal ensembling by alexander-soare · Pull Request #319 · huggingface/lerobot · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Fix ACT temporal ensembling #319

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

alexander-soare
Copy link
Contributor
@alexander-soare alexander-soare commented Jul 12, 2024

What this does

Fixes an issue with the weighting scheme for the temporal ensembling:

image

The fix is to directly use the exponential weighting scheme referred to in Algo2 of https://arxiv.org/abs/2304.13705

image

Here I implement it in an online update fashion so we don't have the ugliness of storing a cache of actions.

How it was tested

I added a test for CI.

I tried eval'ing https://huggingface.co/lerobot/act_aloha_sim_transfer_cube_human/tree/main for 500 episodes with temporal ensembling.

Edit: There was a bug in ACT so this table and subsequent commentary was edited on 17 July with the bug fixed.

Setup Success rate
Without temporal ensembling 87.6%
Prior implementation α=0.99 72.6%
This implementation m=0.1 63.8%
This implementation m=0.01 73.8%
This implementation m=0 76.8%
This implementation m=-0.01 79.0%
This implementation m=-0.1 58.6%
n_action_steps=1 with no ensembling (50 episodes only) 2.0%

Here's an episode for m=0.01:

eval_episode_2.mp4

How to checkout & try? (for the reviewer)

Try it with python lerobot/scripts/eval.py -p lerobot/act_aloha_sim_transfer_cube_human eval.n_episodes=10 eval.batch_size=10 +policy.temporal_ensemble_coeff=0.01 policy.n_action_steps=1

@alexander-soare alexander-soare added bug Something isn’t working correctly policies Items related to robot policies labels Jul 12, 2024
online_avg = ensembler.update(actions)
# Simple offline calculation: avg = Σ(aᵢ*wᵢ) / Σ(wᵢ).
# Note: The complicated bit here is the slicing. Think about the (episode_length, chunk_size) grid.
# What we want to do is take diagonal slices across it starting from the left.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I think this gets a little hairy for a "simple" test, but I really wanted to make sure it's properly checked. I hope the explanation is enough to make the reviewer feel comfortable that this test is doing what it's supposed to. Perhaps it's enough to know that we do the same thing with two approaches and get the same answer.

@alexander-soare
Copy link
Contributor Author

@Alternmill for review please.

8000

@alexander-soare alexander-soare force-pushed the fix_act_temporal_ensembling branch from 0bb6211 to 7dc4765 Compare July 15, 2024 08:38
@alexander-soare alexander-soare self-assigned this Jul 15, 2024
Copy link
Collaborator
@Cadene Cadene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

It seems that this change isn't backward compatible, no?

If this is the case, I am fine with that, but we should warn people on discord that they can't load a checkpoint that has temporal_ensemble_momentum in the config. And ideally provide a minimal procedure to update their checkpoint config.
Also, I am wondering why this backward compatibility breaking change is not captured in our unit tests when we load a model checkpoint.

@alexander-soare
Copy link
Contributor Author
alexander-soare commented Jul 15, 2024

@Cadene it doesn't break unit tests because of this

def _policy_cfg_from_hydra_cfg(policy_cfg_class, hydra_cfg):
expected_kwargs = set(inspect.signature(policy_cfg_class).parameters)
if not set(hydra_cfg.policy).issuperset(expected_kwargs):
logging.warning(
f"Hydra config is missing arguments: {set(expected_kwargs).difference(hydra_cfg.policy)}"
)
# OmegaConf.to_container returns lists where sequences are found, but our dataclasses use tuples to avoid
# issues with mutable defaults. This filter changes all lists to tuples.
def list_to_tuple(item):
return tuple(item) if isinstance(item, list) else item
policy_cfg = policy_cfg_class(
**{
k: list_to_tuple(v)
for k, v in OmegaConf.to_container(hydra_cfg.policy, resolve=True).items()
if k in expected_kwargs
}
)
return policy_cfg
, which means temporal_ensemble_momentum will be ignored and temporal_ensemble_coeff will get a warning.
I think we should probably consider making the former case raise an exception, but there may be a good reasons I didn't do that in the first place.

Yes, I'll mention it on 8000 Discord.

@Cadene
Copy link
Collaborator
Cadene commented Jul 15, 2024

@alexander-soare Could be better to raise an exception to avoid people overriding an argument from command line and it is actually ignored, but they didnt see the warning.

When we go out of alpha into beta, we should try our best to be backward compatible instead of raising exceptions.

@alexander-soare
Copy link
Contributor Author
alexander-soare commented Jul 15, 2024

@Cadene I'm not sure I understand. I suggested raising an exception if someone provides an unknown param. Are you saying something else?

Let's take this discussion off this PR though :)

@alexander-soare alexander-soare merged commit c0101f0 into huggingface:main Jul 16, 2024
5 checks passed
@alexander-soare alexander-soare deleted the fix_act_temporal_ensembling branch July 16, 2024 09:27
amandip7 pushed a commit to amandip7/lerobot that referenced this pull request Oct 10, 2024
menhguin pushed a commit to menhguin/lerobot that referenced this pull request Feb 9, 2025
Kalcy-U referenced this pull request in Kalcy-U/lerobot May 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn’t working correctly policies Items related to robot policies
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0