GitHub - Paul33333/SFT-and-DPO: This is a detailed code demo on how to conduct Full-Param Supervised Fine-tuning (SFT) and DPO (Direct Preference Optimization)


SFT-and-DPO

A detailed code demo showing how to perform full-parameter supervised fine-tuning (SFT) and Direct Preference Optimization (DPO).

For a detailed walkthrough, see: https://zhuanlan.zhihu.com/p/715250294
