-
Notifications
You must be signed in to change notification settings - Fork 57
test: speeding up tests by using dummy small models #233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Terry Kong <terryk@nvidia.com> make native checkpoints faster Signed-off-by: Terry Kong <terryk@nvidia.com> fix tokenizer size Signed-off-by: Terry Kong <terryk@nvidia.com> add pytest line logger for debugging Signed-off-by: Terry Kong <terryk@nvidia.com> vllm wip Signed-off-by: Terry Kong <terryk@nvidia.com> wip Signed-off-by: Terry Kong <terryk@nvidia.com> fix Signed-off-by: Terry Kong <terryk@nvidia.com> wip Signed-off-by: Terry Kong <terryk@nvidia.com> durations=30 Signed-off-by: Terry Kong <terryk@nvidia.com> breakpoint fix missing test_data Signed-off-by: Terry Kong <terryk@nvidia.com> fix Signed-off-by: Terry Kong <terryk@nvidia.com> fix vocab sizes Signed-off-by: Terry Kong <terryk@nvidia.com> fix api issues and change the dtensor check to log prob since logit checking is too strict for this oddly initialized play model Signed-off-by: Terry Kong <terryk@nvidia.com> simplify tests fsdp1 too Signed-off-by: Terry Kong <terryk@nvidia.com> cleanup Signed-off-by: Terry Kong <terryk@nvidia.com>
fbc1980
to
7347aee
Compare
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
"Who is the president of the USA? Who is the president of the USA? Who is", | ||
"What is the capital of the moon? A. Houston, Texas B. New York City", | ||
"Where is the sun? Where is the moon? Where is the earth?", | ||
"a b B B B B B B B B B B", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we really catch bugs with this mock test? Should we rather switch to 126m param real model?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@parthchadha that's true. the focus of my PR was just to drive down the unit test time so admittedly the fidelity of the test signal is worsened by me doing so.
i kind of still think 126m is too high especially if we want to add more tests. one thing is I could SFT the toy model for sequences like:
a b c d e ....
0 1 2 3 4....
2 4 6 8 ....
n v i d i a .....
and maybe push to personal hub for now since we don't want to commit it binary. wdyt?
After merge latest main branch, the threshold and Do you need my help to update it after you resolve the conflicts with main? |
closing in favor of #315 |
Issue
Our unit tests were acting more like functional tests and ballooned our unit test time pretty high. We need to slowly shift to mocking and test smaller units and defer these longer more involved tests to functional tests.
For now, this is a stopgap since the CI time was untenable.
Proposal
Switch all tests to using small dummy (deterministic) models.
tldr
Before
After