8000 test: speeding up tests by using dummy small models by terrykong · Pull Request #233 · NVIDIA-NeMo/RL · GitHub

test: speeding up tests by using dummy small models #233


Closed · wants to merge 3 commits

Conversation

terrykong
Contributor
@terrykong terrykong commented Apr 19, 2025

Issue

Our unit tests have been acting more like functional tests, which ballooned our unit-test time. We need to shift gradually toward mocking and testing smaller units, deferring the longer, more involved tests to functional tests.

For now, this is a stopgap since the CI time was untenable.
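The "shift to mocking" direction mentioned above could look roughly like the sketch below. The class and function names are hypothetical, not from the NeMo-RL codebase; the point is only that a test can verify the unit's logic against an autospecced stand-in instead of spinning up a real model.

```python
from unittest import mock

# Hypothetical stand-in for an expensive model-backed client; in the real
# codebase this would be whatever component the test currently exercises
# end to end.
class SlowModelClient:
    def generate(self, prompt: str) -> str:
        raise RuntimeError("would spin up a large model")

def summarize(client: SlowModelClient, prompt: str) -> str:
    # Unit under test: post-processing logic, independent of the model.
    return client.generate(prompt).strip().lower()

def test_summarize_uses_client():
    # Autospec keeps the mock's call signature in sync with the real class.
    client = mock.create_autospec(SlowModelClient, instance=True)
    client.generate.return_value = "  HELLO  "
    assert summarize(client, "hi") == "hello"
    client.generate.assert_called_once_with("hi")
```

A test like this runs in milliseconds, which is the property the PR is chasing.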

Proposal

Switch all tests to using small dummy (deterministic) models.
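A "small dummy (deterministic) model" could be built around a tiny config and a fixed RNG seed, so every test run and worker sees identical weights. This is an illustrative sketch only; the field names and sizes below are hypothetical, not the configs this PR actually uses.

```python
import random

# Hypothetical tiny-model config for fast tests; values are illustrative.
TINY_CONFIG = {
    "hidden_size": 64,
    "num_layers": 2,
    "num_attention_heads": 2,
    "vocab_size": 512,
}

def init_dummy_weights(config: dict, seed: int = 42) -> dict:
    """Deterministically initialize per-layer weights from a fixed seed,
    so repeated runs produce bit-identical parameters."""
    rng = random.Random(seed)
    return {
        f"layer_{i}.weight": [
            rng.uniform(-0.02, 0.02) for _ in range(config["hidden_size"])
        ]
        for i in range(config["num_layers"])
    }
```

Determinism is what lets assertions on log probs (rather than exact logits) stay stable across CI runs.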

TL;DR

Before

=========== 184 passed, 3 skipped, 9 warnings in 1763.80s (0:29:23) ============

After

============ 182 passed, 5 skipped, 6 warnings in 764.30s (0:12:44) ============

@terrykong terrykong added the CI:L0 Run doctests and unit tests label Apr 19, 2025
@terrykong terrykong changed the title Unit mock ray tests: speeding up tests by using dummy small models Apr 19, 2025
@terrykong terrykong changed the title tests: speeding up tests by using dummy small models test: speeding up tests by using dummy small models Apr 19, 2025
@terrykong terrykong added CI:L0 Run doctests and unit tests Run CICD Set to run CI (unset + set to rerun) and removed CI:L0 Run doctests and unit tests Run CICD Set to run CI (unset + set to rerun) labels Apr 19, 2025
Commits (each signed off by Terry Kong <terryk@nvidia.com>):

- make native checkpoints faster
- fix tokenizer size
- add pytest line logger for debugging
- vllm wip
- wip
- fix
- wip
- durations=30
- breakpoint
- fix missing test_data
- fix
- fix vocab sizes
- fix api issues and change the dtensor check to log prob, since logit checking is too strict for this oddly initialized play model
- simplify tests fsdp1 too
- cleanup
@github-actions github-actions bot added the CI Relating to CI label Apr 20, 2025
@terrykong terrykong marked this pull request as ready for review April 20, 2025 01:06
@terrykong terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L0 Run doctests and unit tests labels Apr 20, 2025
"Who is the president of the USA? Who is the president of the USA? Who is",
"What is the capital of the moon? A. Houston, Texas B. New York City",
"Where is the sun? Where is the moon? Where is the earth?",
"a b B B B B B B B B B B",
Contributor
Will we really catch bugs with this mock test? Should we rather switch to 126m param real model?

Contributor Author

@parthchadha That's true. The focus of my PR was just to drive down the unit-test time, so admittedly the fidelity of the test signal is worsened by doing so.

I still think 126M is too large, especially if we want to add more tests. One option is that I could SFT the toy model on sequences like:

a b c d e ....
0 1 2 3 4....
2 4 6 8 ....
n v i d i a .....

and maybe push it to a personal hub for now, since we don't want to commit the binary. wdyt?
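Those patterned sequences could also be generated on the fly rather than pushed anywhere as a binary. A minimal sketch (function name and default length are illustrative, not from the PR):

```python
import string

def toy_sft_sequences(length: int = 12) -> list[str]:
    """Build the simple patterned sequences suggested above, so a toy
    model can be SFT'd on fully predictable data."""
    return [
        " ".join(string.ascii_lowercase[:length]),           # a b c d ...
        " ".join(str(i) for i in range(length)),             # 0 1 2 3 ...
        " ".join(str(2 * i) for i in range(1, length + 1)),  # 2 4 6 8 ...
        " ".join(("nvidia" * length)[:length]),              # n v i d i a ...
    ]
```

Generating the data inside the test keeps the repo free of committed model or dataset binaries.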

@yuki-666
Contributor

After merging the latest main branch, the threshold and refit_buffer_size_gb in test_vllm_weight_update_memory in tests/unit/models/generation/test_vllm_generation.py should also be updated; otherwise the test is meaningless.

Do you need my help to update it after you resolve the conflicts with main?

@terrykong
Contributor Author

closing in favor of #315

@terrykong terrykong closed this May 5, 2025
Labels
CI:L1 Run doctests, unit tests, and functional tests CI Relating to CI
4 participants