Stars
Fully open reproduction of DeepSeek-R1
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
An efficient, flexible, and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
Reference implementation for Token-level Direct Preference Optimization (TDPO)
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Type less, code more: Cody is an AI code assistant that uses advanced search and codebase context to help you write and fix code.
Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by the Qwen team, Alibaba Cloud.
SGLang is a fast serving framework for large language models and vision language models.
Official repository of the aiXcoder-7B Code Large Language Model
SoTA LLM for converting natural language questions to SQL queries
Ongoing research training transformer models at scale
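A practical interactive interface for LLMs such as GPT and GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen (通义千问), deepseekcoder, iFlytek Spark (讯飞星火), ERNIE Bot (文心一言), llama2, rwkv, claude2, m…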
An incremental parsing system for programming tools
📺 Discover the latest machine learning / AI courses on YouTube.
Machine Learning Yearning, Chinese edition (《机器学习训练秘籍》), by Andrew Ng
SuperCLUE: A Comprehensive Benchmark for General-Purpose Foundation Models in Chinese
DALL·E Mini - Generate images from a text prompt
Retrieval and Retrieval-augmented LLMs
DeepSeek Coder: Let the Code Write Itself
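This project shares the technical principles behind large models along with hands-on experience (LLM engineering and deploying LLM applications in practice).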
GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.