-
Notifications
You must be signed in to change notification settings - Fork 25
feat: Add support for hydra style overrides #80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Are we currently mandating that the config be read-only after overrides are applied? If not, can that be added? |
The config is converted to a dict after overrides are applied. The dict can be modified at any place in the code as of now. If we want to freeze the dict then we will need to convert it to a |
We currently have places where we modify the configs: https://github.com/NVIDIA/reinforcer/blob/main/examples/run_grpo_math.py#L255 how should we deal with this sort of thing? It's not actually obvious to me if we should hard prevent people from using the configs for global variable passing for minor things like this. |
Could we do frozen dict as default and just expect the user to manually unfreeze to add? I agree forbidding users from adding things into the config is a little extreme. @hemildesai where did tron end up falling here? |
Yeah I think we can enforce freezing and unfreezing of the config dict if/when needed. However, that would be outside the scope of this PR. Let's move the discussion to #106
|
Signed-off-by: Hemil Desai <hemild@nvidia.com>
0b017d7
to
ec4a71d
Compare
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com> Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
What does this PR do ?
Adds support for Hydra style overrides. See https://hydra.cc/docs/advanced/override_grammar/basic/#basic-override-syntax and https://hydra.cc/docs/advanced/override_grammar/extended/#extended-override-syntax
Issues
Closes #60
Usage
python examples/run_grpo_math.py +new_key="sort([1,2,3,4])"
Before your PR is "Ready for review"
Pre checks:
Additional Information