What's Changed
- [example] fix runtime env by @hiyouga in #224
- Update Awesome Work using EasyR1 by @RainBowLuoCS in #240
- Update Awesome Work using EasyR1 by @xyliugo in #239
- [trainer] support async reward by @hiyouga in #252
- [readme] add baselines by @hiyouga in #253
- [script] fix merge script by @hiyouga in #254
- [misc] update baselines & docker image by @hiyouga in #256
- [readme] update baseline by @hiyouga in #258
- [data] support custom chat template by @hiyouga in #270
- [reward] support batch reward by @hiyouga in #271
- [example] change env vars by @hiyouga in #272
- [readme] Add awesome work using EasyR1 by @Wangbiao2 in #273
- [model] add qwen3 support by @hiyouga in #276
- [example] update script by @hiyouga in #277
- [readme] update wechat by @hiyouga in #280
- [misc] fix logger by @hiyouga in #288
- [readme] update wechat by @hiyouga in #292
- add get model from modelscope by @Saigyouji-Yuyuko1000 in #297
- [readme] update wechat by @hiyouga in #301
- Add a new work based on EasyR1 by @LiuRicky in #303
- add new work based on EasyR1 by @waltonfuture in #313
- [logger] fix tensorboard by @hiyouga in #316
- Add a new work based on EasyR1 by @Gabesarch in #325
- [misc] fix console hanging by @hiyouga in #293
- [misc] several updates by @hiyouga in #329
- Update README.md by @CSfufu in #330
- [perf] pass raw image data between workers by @tongxiao2002 in #318
- [readme] add our work using EasyR1 by @kxfan2002 in #331
- add our work using EasyR1 by @YutingLi0606 in #337
- [data] fix position ids for qwen2vl mrope & add test by @hiyouga in #339
- [worker] colocate actor and ref model by @hiyouga in #342
- [trainer] save best checkpoint by @hiyouga in #343
- [trainer] fix bug by @hiyouga in #344
- [utils] update data protocol by @hiyouga in #345
- [trainer] repeat rollout and prepare filter by @hiyouga in #346
- [worker] expose rollout manager by @hiyouga in #347
- [worker] fix vllm sharding manager by @hiyouga in #348
- fix: bug by @gdw439 in #350
- [trainer] fix progress bar by @hiyouga in #355
- [readme] update docker image by @hiyouga in #357
- [trainer] add online filtering by @Saigyouji-Yuyuko1000 in #358
- [worker] update reward manager by @hiyouga in #360
- Fix/vllm processor cache for text only model by @cyc00518 in #359
- [breaking] support text-image mixed data by @hiyouga in #361
- [model] fix qwen2vl bug by @hiyouga in #363
- [tracking] add tensorboard exp name by @hiyouga in #365
- [worker] do not load ref if kl is disabled by @hiyouga in #366
- [worker] fix skip ref model by @hiyouga in #367
- [examples] add qwen3_14b_dapo17k_dapo by @Saigyouji-Yuyuko1000 in #369
- [release] 0.3.1 by @hiyouga in #370
New Contributors
- @RainBowLuoCS made their first contribution in #240
- @xyliugo made their first contribution in #239
- @Saigyouji-Yuyuko1000 made their first contribution in #297
- @waltonfuture made their first contribution in #313
- @Gabesarch made their first contribution in #325
- @CSfufu made their first contribution in #330
- @tongxiao2002 made their first contribution in #318
- @kxfan2002 made their first contribution in #331
- @YutingLi0606 made their first contribution in #337
- @gdw439 made their first contribution in #350
- @cyc00518 made their first contribution in #359
Full Changelog: v0.3.0...v0.3.1