Stars
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Currently covers 232 large models, including commercial models such as chatgpt, gpt-4o, o3-mini, Google gemini, Claude3.5, Zhipu GLM-Zero, ERNIE Bot (文心一言), qwen-max, Baichuan, iFlytek Spark, SenseTime SenseChat, minimax, etc., as well as DeepSeek-R1, qwq-32b, deepseek-v3, qwen2.5, llama3.3, phi-4, glm4, gemma3, mistral, 书生in…
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
[ICML 2024 Spotlight] Differentially Private Synthetic Data via Foundation Model APIs 2: Text
A self-learning tutorial for CUDA high-performance programming.
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
Search-R1: An efficient, scalable RL training framework for LLMs that interleave reasoning and search engine calling, based on veRL
A high-throughput and memory-efficient inference and serving engine for LLMs
The official code for NAACL 2024 paper: $E^5$: Zero-shot Hierarchical Table Analysis using Augmented LLMs via Explain, Extract, Execute, Exhibit and Extrapolate
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
A TextRank method based on PageRank, applicable to Chinese keyword, phrase, and summary extraction; written in Scala.
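The repo above is in Scala, but the underlying idea can be illustrated with a minimal, hypothetical Python sketch (not the repo's code): build a co-occurrence graph over words, then run the standard PageRank iteration on it and rank words by score.

```python
# Minimal TextRank-style keyword extraction: PageRank over a word co-occurrence graph.
# This is an illustrative sketch, not the linked repository's implementation.
from collections import defaultdict

def textrank_keywords(words, window=2, d=0.85, iters=50):
    # Build an undirected co-occurrence graph: words within `window`
    # positions of each other share an edge.
    neighbors = defaultdict(set)
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != w:
                neighbors[w].add(words[j])
                neighbors[words[j]].add(w)
    # Standard PageRank iteration with damping factor d.
    score = {w: 1.0 for w in neighbors}
    for _ in range(iters):
        score = {
            w: (1 - d) + d * sum(score[u] / len(neighbors[u]) for u in neighbors[w])
            for w in neighbors
        }
    # Highest-scoring words are the extracted keywords.
    return sorted(score, key=score.get, reverse=True)

words = "deep learning models learn deep features from data".split()
print(textrank_keywords(words)[:3])
```

A real system would tokenize and filter by part of speech first (especially for Chinese, where word segmentation is required before graph construction).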
Text Content Grapher based on key-information extraction via NLP methods. Given a document, it extracts the key information, structures it, and organizes it into a graph, yielding a graph-based presentation of the article's semantics.
Build large-scale task workflows: luigi + job submission + remote targets + environment sandboxing using Docker/Singularity
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
An awesome repository & a comprehensive survey on the interpretability of LLM attention heads.
An NVIDIA AI Workbench example project for fine-tuning a Nemotron-3 8B model
[ACL 2024] An Easy-to-use Instruction Processing Framework for LLMs.
Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-Ranking
T2Ranking: A large-scale Chinese benchmark for passage ranking.
A unified Natural Language Understanding reranker with deep reinforcement learning
This codebase is based on the OLTR codebase.
allRank is a framework for training learning-to-rank neural models based on PyTorch.
A deep reinforcement learning approach to search engine ranking (PyTorch). Final Project for UC Berkeley's CS 285: Deep Reinforcement Learning, Decision Making, and Control