Stars
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
List of papers on hallucination detection in LLMs.
Efficient Triton Kernels for LLM Training
An AI web browsing framework focused on simplicity and extensibility.
DeepRetrieval - Hacking 🔥Real Search Engines and Retrievers with LLM via RL
Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS
[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)
An Open Large Reasoning Model for Real-World Solutions
Moxin is a family of fully open-source and reproducible LLMs
Original source code for The Art of Reinforcement Learning by Michael Hu
90% of what you need for LLM app development. Nothing you don't.
Retrieval and Retrieval-augmented LLMs
Awesome-LLM-RAG: a curated list of advanced retrieval-augmented generation (RAG) in Large Language Models
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality
TextGrad: Automatic "Differentiation" via Text, using large language models to backpropagate textual gradients.
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Implementing the 4 agentic patterns from scratch
FinRL®: Financial Reinforcement Learning. 🔥
Controllable Text Generation for Large Language Models: A Survey
Build Better Websites. Create modern, resilient user experiences with web fundamentals.
Streamlines and simplifies prompt design for both developers and non-technical users with a low-code approach.