Yueeeeeeee (Zhenrui) / Starred · GitHub

Python pdb for multiple processes

Python 48 6 Updated May 24, 2025

Code for Paper: Learning Adaptive Parallel Reasoning with Language Models

Python 98 5 Updated Apr 23, 2025

Official PyTorch implementation for "Large Language Diffusion Models"

Python 2,227 147 Updated Jun 2, 2025

RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning

23 Updated May 27, 2025

Hybrid Latent Reasoning via Reinforcement Learning

Python 76 18 Updated May 27, 2025

AllenAI's post-training codebase

Python 3,003 400 Updated Jun 10, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 21,958 1,467 Updated Jun 9, 2025

Tina: Tiny Reasoning Models via LoRA

Python 254 30 Updated May 29, 2025

A Survey on Multimodal Retrieval-Augmented Generation

213 9 Updated Jun 3, 2025

Go ahead and axolotl questions

Python 9,560 1,036 Updated Jun 10, 2025

Simple RL training for reasoning

Python 3,619 271 Updated Apr 10, 2025

[SIGIR 2025] The official repo for "Scaling Sparse and Dense Retrieval in Decoder-Only LLMs"

Python 15 1 Updated Mar 31, 2025

Atom of Thoughts for Markov LLM Test-Time Scaling

Python 574 48 Updated May 28, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 2,549 183 Updated Jun 6, 2025

The official implementation of Self-Play Preference Optimization (SPPO)

Python 565 46 Updated Jan 23, 2025

Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent

Python 333 22 Updated Apr 22, 2025

Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".

Python 262 21 Updated Feb 19, 2025
Python 564 50 Updated Apr 15, 2025

Code for Heima

Python 45 3 Updated Apr 21, 2025

[ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales

Python 97 6 Updated Feb 6, 2025

Textbook on reinforcement learning from human feedback

TeX 1,003 84 Updated Jun 10, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,883 1,489 Updated Apr 24, 2025

Codes for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models

Jupyter Notebook 227 12 Updated Oct 28, 2024

Medical o1, Towards medical complex reasoning with LLMs

Python 1,127 112 Updated Jan 20, 2025

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Python 910 88 Updated May 13, 2025

Code for BLT research paper

Python 1,679 141 Updated May 22, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Async Agent RL)

Python 7,027 683 Updated Jun 10, 2025

AirLLM 70B inference with single 4GB GPU

Jupyter Notebook 5,788 458 Updated May 6, 2025

This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs.

Python 693 70 Updated Apr 13, 2025