yyht

17 followers · 6 following

slime Public
Forked from THUDM/slime

slime is an LLM post-training framework aimed at scaling RL.

Python Apache License 2.0 Updated Jun 20, 2025
openrlhf_async_pipline Public

Python 55 2 Apache License 2.0 Updated Jun 17, 2025
TreeRL Public
Forked from THUDM/TreeRL

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25

Python Apache License 2.0 Updated Jun 16, 2025
SWE-bench-Live Public
Forked from microsoft/SWE-bench-Live

🚀 SWE-bench Goes Live!

Python MIT License Updated May 30, 2025
ROLL Public
Forked from alibaba/ROLL

Python Apache License 2.0 Updated May 30, 2025
RedTeamCUA Public
Forked from OSU-NLP-Group/RedTeamCUA

RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments

Python Apache License 2.0 Updated May 29, 2025
EvoAgentX Public
Forked from EvoAgentX/EvoAgentX

🚀 EvoAgentX: Building a Self-Evolving Ecosystem of AI Agents

Python Other Updated May 28, 2025
SynLogic Public
Forked from MiniMax-AI/SynLogic

The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Python MIT License Updated May 28, 2025
deer-flow Public
Forked from bytedance/deer-flow

DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.

TypeScript MIT License Updated May 28, 2025
agent-distillation Public
Forked from Nardien/agent-distillation

Python Apache License 2.0 Updated May 26, 2025
One-RL-to-See-Them-All Public
Forked from MiniMax-AI/One-RL-to-See-Them-All

One RL to See Them All: Visual Triple Unified Reinforcement Learning

MIT License Updated May 25, 2025
InternBootcamp Public
Forked from InternLM/InternBootcamp

Python Apache License 2.0 Updated May 23, 2025
MathQ-Verify Public
Forked from scuuy/MathQ-Verify

Apache License 2.0 Updated May 21, 2025
DanceGRPO Public
Forked from XueZeyue/DanceGRPO

Updated May 12, 2025
WebOrganizer Public
Forked from CodeCreator/WebOrganizer

Organize the Web: Constructing Domains Enhances Pre-Training Data Curation

Jupyter Notebook Apache License 2.0 Updated May 2, 2025
openrlhf-async Public

Python 3 Updated Apr 24, 2025
MM-EUREKA Public
Forked from ModalMinds/MM-EUREKA

MM-EUREKA: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Python Apache License 2.0 Updated Mar 8, 2025
AppAgentX Public
Forked from Westlake-AGI-Lab/AppAgentX

Official implementation of AppAgentX: Evolving GUI Agents as Proficient Smartphone Users

Python Updated Mar 6, 2025
PromptCoT Public
Forked from inclusionAI/PromptCoT

Python MIT License Updated Mar 5, 2025
kodcode Public
Forked from KodCode-AI/kodcode

Generate diverse coding questions and verifiable solutions - all in one framework

Python Apache License 2.0 Updated Mar 5, 2025
RAGEN Public
Forked from RAGEN-AI/RAGEN

RAGEN is the first open-source reproduction of DeepSeek-R1 on AGENT training.

Python Apache License 2.0 Updated Feb 6, 2025
demystify-long-cot Public
Forked from eddycmu/demystify-long-cot

Python MIT License Updated Feb 5, 2025
SRA-MCTS Public
Forked from DIRECT-BIT/SRA-MCTS

Python Apache License 2.0 Updated Nov 27, 2024
WorfBench Public
Forked from zjunlp/WorfBench

[ICLR 2025] Benchmarking Agentic Workflow Generation

Python MIT License Updated Nov 26, 2024
VinePPO Public
Forked from McGill-NLP/VinePPO

Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"

Python MIT License Updated Oct 3, 2024
show-me Public
Forked from mrm8488/show-me

A visual and transparent alternative to open-source ChatGPT O1

Python Updated Sep 26, 2024
ell Public
Forked from MadcowD/ell

A language model programming library.

Python MIT License Updated Sep 23, 2024
LeanRL Public
Forked from pytorch-labs/LeanRL

LeanRL is a fork of CleanRL in which selected PyTorch scripts are optimized for performance using compile and cudagraphs.

Python Other Updated Sep 20, 2024
LLM-Engines Public
Forked from jdf-prog/LLM-Engines

Python MIT License Updated Sep 11, 2024
rStar Public
Forked from zhentingqi/rStar

Python MIT License Updated Sep 10, 2024
