wyxscir / Starred · GitHub

wyxscir

🍒

wangyuxin1025@163.com

8 followers · 45 following

beijing

Lists (4)

efficient

14 repositories

largemodel

65 repositories

papercode

tools

Stars

HiThink-Research / BizFinBench

A Business-Driven Real-World Financial Benchmark for Evaluating LLMs

Python 180 4 Updated May 30, 2025

QingyangZhang / Label-Free-RLVR

149 3 Updated Jun 4, 2025

jennyzzt / dgm

Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents

Python 911 195 Updated May 30, 2025

alibaba / ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 596 36 Updated Jun 4, 2025

Paper2Poster / Paper2Poster

Open-source Multi-agent Poster Generation from Papers

Python 1,682 71 Updated Jun 4, 2025

ruixin31 / Rethink_RLVR

Python 235 14 Updated May 27, 2025

sunblaze-ucb / Intuitor

Code for the paper: "Learning to Reason without External Rewards"

Python 243 21 Updated Jun 4, 2025

MiniMax-AI / SynLogic

The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Python 102 4 Updated Jun 3, 2025

EffiVLM-Bench / EffiVLM-Bench

Python 8 Updated Jun 3, 2025

Kelaxon / SSR-Zero

Code for paper "SSR-Zero: Simple Self-Rewarding Reinforcement Learning for Machine Translation"

5 Updated May 28, 2025

laude-institute / terminal-bench

A benchmark for LLMs on complicated tasks in the terminal

Shell 142 26 Updated Jun 4, 2025

Gen-Verse / MMaDA

MMaDA - Open-Sourced Multimodal Large Diffusion Language Models

Python 961 43 Updated Jun 4, 2025

zchuz / SiGIR-MHQA

[ACL 2025 Findings] Self-Critique Guided Iterative Reasoning for Multi-hop Question Answering

Python 2 Updated May 21, 2025

zhaohongxuan / obsidian-weread-plugin

Obsidian Weread Plugin is a plugin to sync Weread (微信读书) highlights and annotations into your Obsidian Vault.

TypeScript 1,444 85 Updated May 6, 2025

RLHFlow / Minimal-RL

Python 196 9 Updated May 14, 2025

QwenLM / ParScale

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 374 15 Updated May 17, 2025

LoverLost / EffiVLM-Bench

Python 1 Updated May 19, 2025

XiaomiMiMo / MiMo

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,405 58 Updated May 30, 2025

ElliottYan / LUFFY

Official Repository of "Learning to Reason under Off-Policy Guidance"

Python 211 22 Updated Jun 3, 2025

LeapLabTHU / Absolute-Zero-Reasoner

Official Repository of Absolute Zero Reasoner

Python 1,450 240 Updated Jun 2, 2025

Alibaba-NLP / ZeroSearch

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Python 958 89 Updated Jun 1, 2025

yfzhang114 / r1_reward

✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Python 136 7 Updated May 9, 2025

RLHFlow / GVM

Python 10 Updated May 7, 2025

ByteDance-Seed / Seed-Coder

Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed.

482 34 Updated May 15, 2025

ypwang61 / One-Shot-RLVR

official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example”

Python 266 20 Updated Jun 2, 2025

HITsz-TMG / YiZhao

YiZhao: A 2TB Open Financial Corpus. Data and tools for generating and inspecting YiZhao, a safe, high-quality, open-source bilingual financial corpus (Chinese and English).

Python 26 3 Updated Dec 12, 2024

idea-iitd / graphgen

GraphGen: A Scalable Approach to Domain-agnostic Labeled Graph Generation

C++ 58 16 Updated Jul 6, 2023

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 2,389 149 Updated Jun 4, 2025

LeapLabTHU / limit-of-RLVR

repo for paper https://arxiv.org/abs/2504.13837

Python 144 7 Updated May 24, 2025

PRIME-RL / TTRL

TTRL: Test-Time Reinforcement Learning

Python 587 43 Updated May 23, 2025
