yyht

17 followers · 6 following

Stars

SkyworkAI / MindLink

Python 48 3 Updated Jun 25, 2025

xfey / MCP-Zero

MCP-Zero: Active Tool Discovery for Autonomous LLM Agents

Python 95 5 Updated Jun 25, 2025

THUDM / slime

slime is an LLM post-training framework aiming at scaling RL.

Python 445 19 Updated Jun 25, 2025

pytorch-labs / LeanRL

LeanRL is a fork of CleanRL, where selected PyTorch scripts are optimized for performance using compile and cudagraphs.

Python 598 26 Updated Oct 26, 2024

THUDM / TreeRL

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25

Python 32 2 Updated Jun 16, 2025

mingyin1 / Agents_Failure_Attribution

ICML 2025 Spotlight

Python 206 11 Updated Jun 21, 2025

MiniMax-AI / SynLogic

The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Python 145 13 Updated Jun 3, 2025

MiniMax-AI / One-RL-to-See-Them-All

The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning

Python 278 12 Updated May 31, 2025

InternLM / InternBootcamp

Python 156 18 Updated Jun 19, 2025

ByteDance-Seed / Seed1.5-VL

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,261 47 Updated Jun 14, 2025

Charleo85 / pyhtmm

A Python implementation of Hidden Topic Markov Model

Python 15 6 Updated May 6, 2018

trycua / cua

c/ua is the Docker Container for Computer-Use AI Agents.

Python 8,779 392 Updated Jun 24, 2025

yyht / openrlhf_async_pipline

Python 55 2 Updated Jun 17, 2025

LeapLabTHU / Absolute-Zero-Reasoner

Official Repository of Absolute Zero Reasoner

Python 1,553 264 Updated Jun 2, 2025

Westlake-AGI-Lab / AppAgentX

Official implementation of AppAgentX: Evolving GUI Agents as Proficient Smartphone Users

Python 445 57 Updated Apr 15, 2025

ModalMinds / MM-EUREKA

MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning

Python 670 23 Updated Jun 25, 2025

inclusionAI / PromptCoT

A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architectures

Python 58 3 Updated Jun 2, 2025

zzli2022 / Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

Python 1,137 57 Updated Jun 8, 2025

PromtEngineer / Agent-0

This project is a **proof of concept** that aims to replicate the reasoning capabilities of OpenAI's newly released O1 model.

Python 87 21 Updated Jan 26, 2025

deep-spin / quest-decoding

A package for sampling from Gibbs distributions during inference with LLMs.

Python 8 2 Updated Jun 12, 2025

X-PLUG / ChatPLUG

A Chinese Open-Domain Dialogue System

Python 321 27 Updated Aug 16, 2023

deepspeedai / DeepSpeedExamples

Example models using DeepSpeed

Python 6,541 1,094 Updated Jun 21, 2025

yyht / flexible-clustering

Forked from matteodellamico/flexible-clustering

Clustering for arbitrary data and dissimilarity function

Python 1 Updated Dec 21, 2021

Hello-SimpleAI / chatgpt-comparison-detection

AAIG-NLP / UniIE

XiaoMi / C3KG

thu-coai / OPD

bitsandbytes-foundation / bitsandbytes

code-kern-ai / refinery

microsoft / Semi-supervised-learning