兰州大学 (Lanzhou University)
Lanzhou, China
Stars
Fully open reproduction of DeepSeek-R1
Fully open data curation for reasoning models
Democratizing Reinforcement Learning for LLMs
800,000 step-level correctness labels on LLM solutions to MATH problems
The official repo for "TheoremQA: A Theorem-driven Question Answering dataset" (EMNLP 2023)
The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" [NeurIPS 2024]
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
LLM-Check: Investigating Detection of Hallucinations in Large Language Models (NeurIPS 2024)
Source code of DRAGIN, ACL 2024 main conference Long Paper (Oral)
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Official Code for Oᴘᴇɴ-RAG: Enhanced Retrieval Augmented Reasoning with Open-Source Large Language Models (EMNLP Findings 2024)
[ICML'2024] Can AI Assistants Know What They Don't Know?
Source code of our paper MIND, ACL 2024 Long Paper
Paper reproduction of Google's SCoRE (Training Language Models to Self-Correct via Reinforcement Learning)
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"