8000 Helper2424/check hilserl by helper2424 · Pull Request #1047 · huggingface/lerobot · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Helper2424/check hilserl #1047

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 192 commits into
base: main
Choose a base branch
8000
from

Conversation

helper2424
Copy link
Contributor

What this does

Explain what this PR does. Feel free to tag your PR with the appropriate label(s).

Examples:

Title Label
Fixes #[issue] (🐛 Bug)
Adds new dataset (🗃️ Dataset)
Optimizes something (⚡️ Performance)

How it was tested

Explain/show how you tested your changes.

Examples:

  • Added test_something in tests/test_stuff.py.
  • Added new_feature and checked that training converges with policy X on dataset/environment Y.
  • Optimized some_function, it now runs X times faster than previously.

How to checkout & try? (for the reviewer)

Provide a simple way for the reviewer to try out your changes.

Examples:

pytest -sx tests/test_stuff.py::test_something
python lerobot/scripts/train.py --some.option=true

SECTION TO REMOVE BEFORE SUBMITTING YOUR PR

Note: Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR. Try to avoid tagging more than 3 people.

Note: Before submitting this PR, please read the contributor guideline.

ChorntonYoel and others added 30 commits April 18, 2025 15:02
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
…licy on the robot (huggingface#541)

Co-authored-by: Yoel <yoel.chornton@gmail.com>
Co-authored-by: Yoel <yoel.chornton@gmail.com>
Co-authored-by: KeWang1017 <ke.wang@helloleap.ai>
…ing logic

- Added `num_subsample_critics`, `critic_target_update_weight`, and `utd_ratio` to SACConfig.
- Implemented target entropy calculation in SACPolicy if not provided.
- Introduced subsampling of critics to prevent overfitting during updates.
- Updated temperature loss calculation to use the new target entropy.
- Added comments for future UTD update implementation.

These changes improve the flexibility and performance of the SAC implementation.
…n handling

- Updated action selection to use distribution sampling and log probabilities for better stochastic behavior.
- Enhanced standard deviation clamping to prevent extreme values, ensuring stability in policy outputs.
- Cleaned up code by removing unnecessary comments and improving readability.

These changes aim to refine the SAC implementation, enhancing its robustness and performance during training and inference.
- Updated standard deviation parameterization in SACConfig to 'softplus' with defined min and max values for improved stability.
- Modified action sampling in SACPolicy to use reparameterized sampling, ensuring better gradient flow and log probability calculations.
- Cleaned up log probability calculations in TanhMultivariateNormalDiag for clarity and efficiency.
- Increased evaluation frequency in YAML configuration to 50000 for more efficient training cycles.

These changes aim to enhance the robustness and performance of the SAC implementation during training and inference.
…d stability

- Updated SACConfig to replace standard deviation parameterization with log_std_min and log_std_max for better control over action distributions.
- Modified SACPolicy to streamline action selection and log probability calculations, enhancing stochastic behavior.
- Removed deprecated TanhMultivariateNormalDiag class to simplify the codebase and improve maintainability.

These changes aim to enhance the robustness and performance of the SAC implementation during training and inference.
michel-aractingi and others added 30 commits April 18, 2025 16:18
…ology with 'discrete_critic'. Update related methods and comments for clarity and consistency in handling discrete actions.
The test was inadvertently comparing uninitialized parts of the array,
which could lead to inconsistent or undefined results. This fix ensures
only the relevant, properly initialized sections are checked.

Co-authored-by: Eugene Mironov <helper2424@gmail.com>
… lerobot (huggingface#1018)

Co-authored-by: imstevenpmwork <steven.palma@huggingface.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
0