zhusq20 (Siqi Zhu) / Starred · GitHub

🏡

Working from dormitory

Siqi Zhu zhusq20

34 followers · 276 following

Beijing

Highlights

Pro

Lists (4)

agentserve

work in progress

homework

kernel

lm reasoning

26 repositories

Stars

bigcode-project / bigcodebench

[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI

Python 358 43 Updated Apr 11, 2025

eddiegaoo / Apt-Serve

Python 5 1 Updated Apr 12, 2025

dilab-zju / self-speculative-decoding

Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**

Jupyter Notebook 186 12 Updated Feb 13, 2025

mit-han-lab / Quest

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda 282 30 Updated Nov 22, 2024

ByteDance-Seed / ShadowKV

[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Python 178 12 Updated May 1, 2025

huggingface / Math-Verify

Python 688 28 Updated Apr 28, 2025

FasterDecoding / TEAL

Python 129 8 Updated Feb 15, 2025

liuxukun2000 / Adaptix

Adaptive Draft-Verification for Efficient Large Language Model Decoding (AAAI 2025 Oral)

Python 66 5 Updated Apr 1, 2025

facebookresearch / LayerSkip

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

Python 294 25 Updated May 3, 2025

SUFE-AIFLM-Lab / Fin-R1

587 68 Updated Mar 27, 2025

ByteDance-Seed / Triton-distributed

Distributed Triton for Parallel Systems

Python 687 42 Updated May 12, 2025

THUDM / WebRL

Building Open LLM Web Agents with Self-Evolving Online Curriculum RL

Python 377 27 Updated Apr 30, 2025

web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Python 988 149 Updated Feb 7, 2025

RUC-NLPIR / FlashRAG

⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)

Python 2,276 199 Updated May 12, 2025

NousResearch / atropos

Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments

Python 336 24 Updated May 13, 2025

theworldofagents / Agentic-Reasoning

free and open OpenAI Deep Research

Python 549 77 Updated Feb 18, 2025

xinzhel / LLM-Agent-Survey

Survey on LLM Agents (Published on CoLing 2025)

258 13 Updated May 6, 2025

XuehaiPan / nvitop

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Python 5,523 169 Updated May 13, 2025

SalesforceAIResearch / xLAM

xLAM: A Family of Large Action Models to Empower AI Agent Systems

Python 428 33 Updated May 12, 2025

bytedance / SandboxFusion

Python 270 18 Updated Feb 7, 2025

heshengtao / comfyui_LLM_party

LLM Agent Framework in ComfyUI includes MCP server, Omost, GPT-sovits, ChatTTS, GOT-OCR2.0, and FLUX prompt nodes, access to Feishu, discord, and adapts to all llms with similar openai / aisuite interfac…

Python 1,669 141 Updated May 10, 2025

Agent-RL / ReCall

ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning

Python 845 58 Updated Apr 30, 2025

tsinghua-fib-lab / AgentSociety

AgentSociety: Large-scale Social Simulation to Understand Human Behaviors and Society through LLM-driven Agents

Python 279 46 Updated Apr 28, 2025

opendilab / awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

3,933 239 Updated Apr 30, 2025

Parallel-Reasoning / APR

Code for Paper: Learning Adaptive Parallel Reasoning with Language Models

Python 81 4 Updated Apr 23, 2025

sail-sg / FlowReasoner

Python 108 6 Updated May 6, 2025

ByteDance-Seed / FlexPrefill

Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference

Python 101 5 Updated Apr 17, 2025

SqueezeAILab / LLMCompiler

[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

Python 1,683 123 Updated Jul 10, 2024

ElliottYan / LUFFY

Official Repository of "Learning to Reason under Off-Policy Guidance"

Python 173 16 Updated May 8, 2025

facebookresearch / sweet_rl

Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Python 192 9 Updated May 5, 2025
