docs: Add missing arguments to DeepScaler evaluation by butsugiri · Pull Request #502 · NVIDIA-NeMo/RL · GitHub

docs: Add missing arguments to DeepScaler evaluation #502


Open: butsugiri wants to merge 1 commit into main

@butsugiri butsugiri commented Jun 11, 2025

What does this PR do?

This PR fixes the documentation for the DeepScaler experiments.

The documented evaluation command is currently missing necessary arguments, which leads to poor evaluation results:

============================================================
model_name='step_300-hf' dataset_name='aime_2024'
max_new_tokens=2048 temperature=0.0 top_p=1.0 top_k=-1

metric='pass@1' num_tests_per_prompt=1

score=0.0333 (1.0/30)
============================================================

Specifying cot.txt, as the training setup does, improves the result somewhat:

============================================================
model_name='step_300-hf' dataset_name='aime_2024'
max_new_tokens=2048 temperature=0.0 top_p=1.0 top_k=-1

metric='pass@1' num_tests_per_prompt=1

score=0.1333 (4.0/30)
============================================================

Allowing generation of more than 2048 tokens improves the result further (this PR):

============================================================
model_name='step_300-hf' dataset_name='aime_2024'
max_new_tokens=8192 temperature=0.0 top_p=1.0 top_k=-1

metric='pass@1' num_tests_per_prompt=1

score=0.3667 (11.0/30)
============================================================
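
For reference, the three scores above can be reproduced from the reported counts. This is only a minimal sketch of the arithmetic: the run labels and the `pass_at_1` helper are my own naming, not part of NeMo RL, and it assumes one sample per prompt (`num_tests_per_prompt=1`), where pass@1 is simply correct answers divided by problems.

```python
# Correct-answer counts copied from the three logs above (AIME 2024, 30 problems).
# The labels are informal descriptions of each run, not NeMo RL identifiers.
NUM_PROBLEMS = 30
correct_counts = {
    "docs as written, max_new_tokens=2048": 1,
    "with cot.txt, max_new_tokens=2048": 4,
    "this PR, max_new_tokens=8192": 11,
}


def pass_at_1(num_correct: int, num_problems: int) -> float:
    """pass@1 with a single sample per prompt: fraction of problems solved."""
    return num_correct / num_problems


scores = {
    name: round(pass_at_1(count, NUM_PROBLEMS), 4)
    for name, count in correct_counts.items()
}

for name, score in scores.items():
    print(f"{name}: score={score:.4f}")
```

With only 30 problems, each additional correct answer moves the score by about 0.033, so the gap between 1/30 and 11/30 is ten problems.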

Issues

n/a

Usage

n/a

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests? --> n/a
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests --> n/a
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs. --> n/a

Signed-off-by: Shun Kiyono <shun.kiyono@sbintuitions.co.jp>
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jun 11, 2025
@butsugiri butsugiri changed the title Add missing arguments to DeepScaler evaluation docs: Add missing arguments to DeepScaler evaluation Jun 11, 2025
@parthchadha parthchadha requested a review from abukharin-nv June 26, 2025 22:19
@SahilJain314
Contributor

Thanks for the PR! Slipped past us for a bit.

Contributor
@abukharin-nv abukharin-nv left a comment


LGTM! I would also suggest increasing max_len to 32K, but that is kind of a subjective choice.
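
A rough illustration of why the generation cap matters for this kind of evaluation. This is a toy sketch, not NeMo RL code: it assumes the final answer appears in `\boxed{...}` at the end of a long chain-of-thought response, uses whitespace splitting as a stand-in tokenizer, and the helper names and the example response are hypothetical.

```python
import re


def extract_boxed(text):
    """Return the content of the last \\boxed{...} in text, or None if absent."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None


def truncate(text, max_new_tokens):
    """Keep the first max_new_tokens whitespace-separated tokens (toy tokenizer)."""
    return " ".join(text.split()[:max_new_tokens])


# Hypothetical response: 3000 reasoning tokens, then the final answer.
response = " ".join(["step"] * 3000) + r" so the answer is \boxed{204}"

short_answer = extract_boxed(truncate(response, 2048))  # cut off before the answer
long_answer = extract_boxed(truncate(response, 8192))   # full response fits

print(short_answer, long_answer)
```

Under these assumptions, the response truncated at 2048 tokens never reaches its final answer and is scored as wrong regardless of the reasoning, while the 8192-token budget lets the answer through.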

Labels
documentation Improvements or additions to documentation
3 participants