Releases: hiyouga/EasyR1
v0.3.1: Multi-modal DAPO
What's Changed
- [example] fix runtime env by @hiyouga in #224
- Update Awesome Work using EasyR1 by @RainBowLuoCS in #240
- Update Awesome Work using EasyR1 by @xyliugo in #239
- [trainer] support async reward by @hiyouga in #252
- [readme] add baselines by @hiyouga in #253
- [script] fix merge script by @hiyouga in #254
- [misc] update baselines & docker image by @hiyouga in #256
- [readme] update baseline by @hiyouga in #258
- [data] support custom chat template by @hiyouga in #270
- [reward] support batch reward by @hiyouga in #271
- [example] change env vars by @hiyouga in #272
- [Readme] Add awesome work using EasyR1 by @Wangbiao2 in #273
- [model] add qwen3 support by @hiyouga in #276
- [example] update script by @hiyouga in #277
- [readme] update wechat by @hiyouga in #280
- [misc] fix logger by @hiyouga in #288
- [readme] update wechat by @hiyouga in #292
- add get model from modelscope by @Saigyouji-Yuyuko1000 in #297
- [readme] update wechat by @hiyouga in #301
- Add a new work based on EasyR1 by @LiuRicky in #303
- add new work based on EasyR1 by @waltonfuture in #313
- [logger] fix tensorboard by @hiyouga in #316
- Add a new work based on EasyR1 by @Gabesarch in #325
- [misc] fix console hanging by @hiyouga in #293
- [misc] several update by @hiyouga in #329
- Update README.md by @CSfufu in #330
- [perf] pass raw image data between workers by @tongxiao2002 in #318
- [readme] add our work using EasyR1 by @kxfan2002 in #331
- add our work using EasyR1 by @YutingLi0606 in #337
- [data] fix position ids for qwen2vl mrope & add test by @hiyouga in #339
- [worker] colocate actor and ref model by @hiyouga in #342
- [trainer] save best checkpoint by @hiyouga in #343
- [trainer] fix bug by @hiyouga in #344
- [utils] update data protocol by @hiyouga in #345
- [trainer] repeat rollout and prepare filter by @hiyouga in #346
- [worker] expose rollout manager by @hiyouga in #347
- [worker] fix vllm sharding manager by @hiyouga in #348
- fix: bug by @gdw439 in #350
- [trainer] fix progress bar by @hiyouga in #355
- [readme] update docker image by @hiyouga in #357
- [trainer] add online filtering by @Saigyouji-Yuyuko1000 in #358
- [worker] update reward manager by @hiyouga in #360
- Fix/vllm processor cache for text only model by @cyc00518 in #359
- [breaking] support text-image mixed data by @hiyouga in #361
- [model] fix qwen2vl bug by @hiyouga in #363
- [tracking] add tensorboard exp name by @hiyouga in #365
- [worker] do not load ref if kl is disabled by @hiyouga in #366
- [worker] fix skip ref model by @hiyouga in #367
- [examples] add qwen3_14b_dapo17k_dapo by @Saigyouji-Yuyuko1000 in #369
- [release] 0.3.1 by @hiyouga in #370
New Contributors
- @RainBowLuoCS made their first contribution in #240
- @xyliugo made their first contribution in #239
- @Saigyouji-Yuyuko1000 made their first contribution in #297
- @waltonfuture made their first contribution in #313
- @Gabesarch made their first contribution in #325
- @CSfufu made their first contribution in #330
- @tongxiao2002 made their first contribution in #318
- @kxfan2002 made their first contribution in #331
- @YutingLi0606 made their first contribution in #337
- @gdw439 made their first contribution in #350
- @cyc00518 made their first contribution in #359
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- update readme by @hiyouga in #4
- [readme] update readme by @hiyouga in #5
- [worker] fix small models by @hiyouga in #14
- feat: swanlab examples by @Zeyi-Lin in #13
- [example] add ReMax support by @Shenzhi-Wang in #20
- fix:vllm length by @AL-377 in #18
- fix: math reward fn by @yueyang130 in #26
- [readme] update readme by @hiyouga in #29
- Fix template issue by @wzq016 in #31
- [example] fix length by @hiyouga in #32
- [readme] update hardware requirement by @hiyouga in #33
- [worker] fix model attn init by @hiyouga in #37
- Witness the Aha Moment on Counting Task by @BUAADreamer in #38
- [example] fix clevr example by @hiyouga in #47
- Fix: save processor for VLMs by @wzq016 in #48
- [perf] support padding-free training for VLMs by @hiyouga in #61
- [readme] update readme by @hiyouga in #62
- [readme] add fig explain by @hiyouga in #64
- [readme] update fig by @hiyouga in #65
- [trainer] support resume ckpt by @hiyouga in #66
- [config] update default config by @hiyouga in #68
- [readme] update wechat by @hiyouga in #71
- [env] fix memory leak & enable vLLM v1 by @hiyouga in #73
- [readme] update readme by @hiyouga in #75
- [readme] update readme by @hiyouga in #80
- Add new baseline GeoQA8k from R1V by @chenllliang in #86
- [feat] support freeze vision tower by @hiyouga in #99
- [config] increase prompt length by @hiyouga in #100
- update readme - add ## Awesome Work using EasyR1 by @LengSicong in #101
- Add the work Vision-R1 that uses EasyR1 by @Osilly in #102
- fix:OOM by @dirtyDan0 in #111
- [trainer] verify arg by @hiyouga in #112
- [misc] sync feat from upstream by @hiyouga in #113
- [misc] clean some code by @hiyouga in #114
- [example] add examples by @hiyouga in #118
- [checkpoint] fix load checkpoint by @hiyouga in #119
- [trainer] gather metrics by @hiyouga in #120
- [misc] add doc string by @hiyouga in #121
- Add seg zero to README by @LiuRicky in #122
- Update README.md by @PzySeere in #124
- fix readme by @hiyouga in #127
- [core] remove entropy loss by @hiyouga in #132
- [trainer] support val sampling by @hiyouga in #133
- misc: save at the last step by @dirtyDan0 in #138
- feat: swanlab add `easyr1` and `verl` config by @Zeyi-Lin in #140
- [version] upgrade vllm to 0.8 by @hiyouga in #143
- [readme] update docker file by @hiyouga in #146
- [readme] update wechat by @hiyouga in #147
- [readme] update dockerfile by @hiyouga in #148
- Update requirements.txt for multinode by @chenllliang in #154
- [trainer] support channel-wise reward by @hiyouga in #155
- Update README.md by @PzySeere in #157
- [trainer] support save limit & fix oom issue by @hiyouga in #158
- [misc] update docker files by @hiyouga in #162
- [trainer] support 32b by @hiyouga in #164
- [data] use hf-native template by @hiyouga in #165
- [misc] fix dataset by @hiyouga in #166
- [readme] update tutorial by @hiyouga in #167
- [tracking] add tensorboard by @hiyouga in #170
- [misc] support adamw bf16 by @hiyouga in #171
- [misc] fix config by @hiyouga in #172
- [misc] fix metrics by @hiyouga in #173
- [misc] refactor val gen log by @hiyouga in #174
- update Awesome Work using EasyR1 by @appletea233 in #179
- [misc] fix masked mean by @hiyouga in #181
- [misc] algo improvement by @hiyouga in #184
- [misc] minor update by @hiyouga in #188
- [fix] arg check by @hiyouga in #189
- [bugfix] fix vllm 0.8.3 rollout by @hiyouga in #197
- [deps] upgrade to vllm 0.8.3 by @hiyouga in #202
- [core] separate score fn & vllm logit bias by @hiyouga in #204
- Supports loading format prompt from a file by @Wangbiao2 in #208
- [data] update data configs by @hiyouga in #214
- fix: enable user to filter overlong examples in RLHFDataset by @0x404 in #210
- [data] fix rl dataset by @hiyouga in #215
- [misc] lint by @hiyouga in #216
- [data] add multi image dataset by @hiyouga in #217
- [readme] add multi node script by @hiyouga in #218
- [torch] fix saving bf16 optimizer by @hiyouga in #221
- [version] release 0.3.0 by @hiyouga in #222
New Contributors
- @Zeyi-Lin made their first contribution in #13
- @AL-377 made their first contribution in #18
- @yueyang130 made their first contribution in #26
- @wzq016 made their first contribution in #31
- @BUAADreamer made their first contribution in #38
- @chenllliang made their first contribution in #86
- @LengSicong made their first contribution in #101
- @Osilly made their first contribution in #102
- @dirtyDan0 made their first contribution in #111
- @LiuRicky made their first contribution in #122
- @PzySeere made their first contribution in #124
- @appletea233 made their first contribution in #179
- @Wangbiao2 made their first contribution in #208
- @0x404 made their first contribution in #210
Full Changelog: https://github.com/hiyouga/EasyR1/commits/v0.3.0