8000 Reward classifier and training by ChorntonYoel · Pull Request #528 · huggingface/lerobot · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Reward classifier and training #528

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

ChorntonYoel
Copy link
@ChorntonYoel ChorntonYoel commented Nov 26, 2024

What this does

Title Label
Item 2 of Reward Classifier in issue #504 (Feature)

This PR is meant to add a reward classifier (used to classify if an image of a robot performing a task should get a reward or not), a training file allowing the training of the classifier (with logging + resuming), a config.yaml file that can be used to start a training, and a few tests for the training loop

How it was tested

Using 10 episodes made with the reward system of this PR: #518
Also I added a test file for the training classifier file. Lots of things are mocked but it covers the basics I believe.

How to checkout & try? (for the reviewer)

python lerobot/scripts/train_classifier.py \
    --config-name", "policy/reward_classifier.yaml",

With the wandb entity and the dataset name adapted.

I was able to reproduce 95%+ after a few epochs with facebook/convnext-base-224 as backbone and a dataset of 10 epsiodes of ~15 sec.
This branch was built on top of the branch from #518 so will need to wait for this one to be merged befre merging

@ChorntonYoel ChorntonYoel marked this pull request as ready for review November 29, 2024 16:25
@michel-aractingi
Copy link
Contributor

Nice work @ChorntonYoel ! Could you move the classifier directory to lerobot/common/policies/classifier/ to lerobot/common/policies/hilserl/classifier.

Since now we will only use the reward classifier for hil-serl then we will put everything in its directory. In the future when the classifiers are more established we can have a separate directory in lerobot/common/classifiers to host different kinds of reward recognition models.

Comment on lines 69 to 73
elif name == "classifier":
from lerobot.common.policies.classifier.configuration_classifier import ClassifierConfig
from lerobot.common.policies.classifier.modeling_classifier import Classifier

return Classifier, ClassifierConfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its not ideal to put the classifier in the factory.py of policies. I think we can remove and instead of relying on make_policy in the training script we can directly define the classifier there. Since the training script of the classifier is not train.py.

What do you think?

Copy link
Author
@ChorntonYoel ChorntonYoel Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this better now that the classifier has the "policy" hilserl/classifier" ?
Or do you still think it's confusing and we should initialize it in a different way?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

from lerobot.common.datasets.factory import resolve_delta_timestamps
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
from lerobot.common.logger import Logger
from lerobot.common.policies.factory import make_policy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could remove make_policy and manually define it later.

Suggested change
from lerobot.common.policies.factory import make_policy
from lerobot.common.policies.classifier.configuration_classifier import ClassifierConfig
from lerobot.common.policies.classifier.modeling_classifier import Classifier

Comment on lines 239 to 244
model = make_policy(
hydra_cfg=cfg,
dataset_stats=dataset.meta.stats if not cfg.resume else None,
pretrained_policy_name_or_path=str(logger.last_pretrained_model_dir) if cfg.resume else None,
).to(device)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can define the classifier here:

Suggested change
model = make_policy(
hydra_cfg=cfg,
dataset_stats=dataset.meta.stats if not cfg.resume else None,
pretrained_policy_name_or_path=str(logger.last_pretrained_model_dir) if cfg.resume else None,
).to(device)
from lerobot.common.policies.factory import _policy_cfg_from_hydra_cfg
classifier_cfg = _policy_cfg_from_hydra_cfg(ClassifierConfig, cfg)
if not cfg.resume:
model = Classifier(classifier_config, dataset.meta.stats)
else:
model = Classifier(classifier_config)
model.load_state_dict(Classifier.from_pretrained(str(logger.last_pretrained_model_dir)).state_dict())
model = model.to(device)

Copy link
Author
@ChorntonYoel ChorntonYoel Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated, but would you still prefer I do that? I don't mind

ChorntonYoel and others added 3 commits December 6, 2024 10:38
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
@michel-aractingi michel-aractingi merged commit 6490927 into huggingface:user/michel-aractingi/2024-11-27-port-hil-serl Dec 9, 2024
@michel-aractingi michel-aractingi mentioned this pull request Dec 9, 2024
helper2424 pushed a commit to helper2424/lerobot that referenced this pull request Dec 16, 2024
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
helper2424 pushed a commit to helper2424/lerobot that referenced this pull request Dec 17, 2024
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
michel-aractingi added a commit that referenced this pull request Jan 22, 2025
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
AdilZouitine pushed a commit that referenced this pull request Mar 24, 2025
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
AdilZouitine pushed a commit that referenced this pull request Mar 24, 2025
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
AdilZouitine pushed a commit that referenced this pull request Mar 28, 2025
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
michel-aractingi added a commit that referenced this pull request Apr 17, 2025
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
michel-aractingi added a commit that referenced this pull request Apr 18, 2025
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0