Peking University
https://jpthu17.github.io/
Stars
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning
Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…
📌 [Arxiv2025] Official implementation of "NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representation"
GPT-ImgEval: Evaluating GPT-4o’s state-of-the-art image generation capabilities
V1: Toward Multimodal Reasoning by Designing Auxiliary Task
[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ :news: qbot-mini: https://github.com/Charmve/iQuant
An Open-source RL System from ByteDance Seed and Tsinghua AIR
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Official implementation of UnifiedReward & UnifiedReward-Think
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
Minimal reproduction of DeepSeek R1-Zero
verl: Volcano Engine Reinforcement Learning for LLMs
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
[ICLR 2025][arXiv:2406.07548] Image and Video Tokenization with Binary Spherical Quantization
A journey to a real multimodal R1! We are running large-scale experiments
A fork to add multimodal model training to open-r1
Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
Fully open reproduction of DeepSeek-R1