Hi! Please refer to Section 4.2 of our paper for details on DPO. We use the same codebase as ChatGLM-RLHF. We currently have no plans to release the code and data for DPO.
Any plans to release the DPO code, or to share a brief introduction to how you conducted long-context DPO?