VRAM for full-finetuning · Issue #10 · fal-ai/f-lite · GitHub

VRAM for full-finetuning #10

Closed
alfredplpl opened this issue May 6, 2025 · 4 comments

Comments

@alfredplpl (Contributor) commented May 6, 2025

Great work!

I would like to fully fine-tune F-Lite.
But I cannot fine-tune it on an A6000 (48 GB).

How much VRAM does full fine-tuning require?

I have tested these options:

  • precomputed embeddings (512x512, batch size: 1)
  • 8-bit Adam
  • gradient checkpointing
  • mixed precision: bf16

Thanks in advance!
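
For reference, the options above roughly correspond to the following generic PyTorch pattern (a minimal sketch with a toy model and illustrative hyperparameters, not F-Lite's actual training script):

```python
# Combining 8-bit Adam, gradient checkpointing, and bf16 mixed precision.
# The model here is a tiny stand-in; F-Lite itself is ~10B parameters.
import torch
import torch.nn as nn
import bitsandbytes as bnb  # pip install bitsandbytes
from torch.utils.checkpoint import checkpoint

model = nn.Sequential(nn.Linear(512, 512), nn.GELU(), nn.Linear(512, 512)).cuda()

optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-5)  # 8-bit optimizer states

x = torch.randn(1, 512, device="cuda")       # batch size 1, as in the issue
target = torch.randn(1, 512, device="cuda")

optimizer.zero_grad(set_to_none=True)
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):  # bf16 mixed precision
    out = checkpoint(model, x, use_reentrant=False)             # gradient checkpointing
    loss = nn.functional.mse_loss(out, target)
loss.backward()
optimizer.step()
```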

@erwold (Collaborator) commented May 6, 2025

Not sure if this is the reason, but could you try setting gradient_checkpoint to true in your model’s config.json file?

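For example, the config can be patched programmatically (a hypothetical snippet; the path is illustrative and the key name is taken from the suggestion above):

```python
import json

# Illustrative path; point this at the config.json of your local F-Lite checkpoint.
config_path = "path/to/f-lite/config.json"

with open(config_path) as f:
    config = json.load(f)

config["gradient_checkpoint"] = True  # key name as suggested in this thread

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```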

@alfredplpl (Contributor, Author) commented May 6, 2025

Thank you for your advice.

I modified the JSON.
But the training code still ran into an OOM error.

@erwold (Collaborator) commented May 6, 2025

Oh, I see: you are doing full fine-tuning instead of LoRA. For full fine-tuning, the VRAM cost goes up to roughly ~80 GB; sorry, the model is just that big...
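
For a rough sense of where that number comes from (an illustrative back-of-envelope estimate, not an official measurement):

```python
# Back-of-envelope VRAM estimate for full fine-tuning a ~10B-parameter model
# with bf16 weights/gradients and 8-bit Adam (two optimizer states per param).
params = 10e9

weights_gb = params * 2 / 1e9  # bf16 weights:   2 bytes/param    -> ~20 GB
grads_gb   = params * 2 / 1e9  # bf16 gradients: 2 bytes/param    -> ~20 GB
adam_gb    = params * 2 / 1e9  # 8-bit Adam:     2 x 1 byte/param -> ~20 GB

print(f"~{weights_gb + grads_gb + adam_gb:.0f} GB before activations")
# ~60 GB; activations, temporary buffers, and fragmentation push the
# total toward the ~80 GB mentioned above.
```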

@alfredplpl (Contributor, Author)

I see.

The 10B model is too big to fine-tune.
I will use an RTX PRO 6000 (96 GB).

Thanks.
