[Feature] Explicit failure for unmatched model and checkpoints · Issue #415 · NVIDIA/NeMo-RL · GitHub

Open
KiddoZhu opened this issue May 19, 2025 · 0 comments
Labels
bug Something isn't working

Comments

@KiddoZhu
Collaborator

Currently, nemo-rl always tries to resume from the latest checkpoint in the checkpoint path. If we change the policy model, the new model silently fails to load the old checkpoints, which has two negative consequences:

  1. New checkpoints will overwrite old checkpoints from a different model.
  2. The training step is counted from the old checkpoint, even if the new model is actually trained from scratch.

I feel it's better to fail explicitly when the policy model doesn't match the checkpoints, to prevent such undefined behavior.
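
To make the request concrete, here is a minimal sketch of what an explicit check could look like. It is not nemo-rl's actual checkpoint code: the `model_meta.json` sidecar file, the `step_<N>` directory naming, and the names `write_checkpoint_metadata`, `resolve_resume_step`, and `CheckpointMismatchError` are all hypothetical, chosen only to illustrate the idea of recording the policy model alongside each checkpoint and refusing to resume on a mismatch.

```python
import json
from pathlib import Path


class CheckpointMismatchError(RuntimeError):
    """Raised when the latest checkpoint was written by a different policy model."""


def write_checkpoint_metadata(step_dir: Path, model_name: str, step: int) -> None:
    # Hypothetical sidecar file stored next to the checkpoint weights.
    step_dir.mkdir(parents=True, exist_ok=True)
    (step_dir / "model_meta.json").write_text(
        json.dumps({"model_name": model_name, "step": step})
    )


def resolve_resume_step(checkpoint_root: Path, model_name: str) -> int:
    """Return the step to resume from, or 0 when no checkpoint exists.

    Fails loudly if the latest checkpoint belongs to another model, instead of
    silently training from scratch while keeping the old step count.
    """
    if not checkpoint_root.is_dir():
        return 0
    # Assumes checkpoints live in directories named step_<N> (an assumption).
    metas = sorted(
        checkpoint_root.glob("step_*/model_meta.json"),
        key=lambda p: int(p.parent.name.split("_")[1]),
    )
    if not metas:
        return 0
    meta = json.loads(metas[-1].read_text())
    if meta["model_name"] != model_name:
        raise CheckpointMismatchError(
            f"Checkpoint at {metas[-1].parent} was saved by "
            f"{meta['model_name']!r}, but the current policy model is "
            f"{model_name!r}. Use a different checkpoint path or remove the "
            f"old checkpoints."
        )
    return meta["step"]
```

With a check like this, both consequences listed above would be avoided: the run stops before any old checkpoint is overwritten, and the step counter is never inherited from a checkpoint the model did not actually load.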

@parthchadha added the bug (Something isn't working) label May 20, 2025
@parthchadha self-assigned this May 20, 2025