Stars
Dense Passage Retriever: a set of tools and models for open-domain Q&A tasks.
Repository for "Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions" (ACL 2023)
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain with language models such as ChatGLM, Qwen, and Llama; a local-knowledge-based LLM application.
[ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'
Conversion between Traditional and Simplified Chinese
Ongoing research training transformer models at scale
FlashMLA: Efficient MLA decoding kernels
Official Repo for Open-Reasoner-Zero
Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)
Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"
Can Knowledge Editing Really Correct Hallucinations? (ICLR 2025)
Library for Knowledge Intensive Language Tasks
A high-throughput and memory-efficient inference and serving engine for LLMs (see the usage sketch after this list)
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
EMNLP 2021: "Simple Entity-centric Questions Challenge Dense Retrievers" (https://arxiv.org/abs/2109.08535)
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Fact-Checking the Output of Generative Large Language Models in both Annotation and Evaluation.
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
🧑🚀 A curated summary of the world's best LLM resources (video generation, agents, coding assistance, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).
Official repo for the paper "DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning".
Fast and memory-efficient exact attention
A high-accuracy and efficient multi-task fine-tuning framework for Code LLMs; accepted at KDD 2024.
[CCS'24] A dataset consists of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).
[ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO
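For the vLLM entry above (the high-throughput inference and serving engine), here is a minimal offline-inference sketch using vLLM's Python API. The model id, prompt, and sampling settings are illustrative placeholders and are not prescribed by any repository in this list.

```python
# Minimal sketch: offline batch inference with vLLM's Python API.
# Model id, prompt, and sampling settings are illustrative placeholders.
from vllm import LLM, SamplingParams

prompts = ["What is dense passage retrieval?"]
sampling_params = SamplingParams(temperature=0.7, max_tokens=128)

llm = LLM(model="facebook/opt-125m")  # any Hugging Face-compatible model id
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    # Each result holds the prompt and its generated completion(s).
    print(output.outputs[0].text)
```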