More
Stars
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
Convert PDF to markdown + JSON quickly with high accuracy
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …
🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.
官方推荐的 ChatTTS 资源汇总项目,整理了全网相关资源和常见问题 || Officially recommended ChatTTS resource collection project
Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A high-throughput and memory-efficient inference and serving engine for LLMs
llama3 implementation one matrix multiplication at a time
OCR, layout analysis, reading order, table recognition in 90+ languages
A Python library to extract tabular data from PDFs
Production-ready platform for agentic workflow development.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
LLM API 管理 & 分发系统,支持 OpenAI、Azure、Anthropic Claude、Google Gemini、DeepSeek、字节豆包、ChatGLM、文心一言、讯飞星火、通义千问、360 智脑、腾讯混元等主流模型,统一 API 适配,可用于 key 管理与二次分发。单可执行文件,提供 Docker 镜像,一键部署,开箱即用。LLM API management & k…
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transf…
💬 MaxKB is an open-source AI assistant for enterprise. It seamlessly integrates RAG pipelines, supports robust workflows, and provides MCP tool-use capabilities.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.
Joplin - the privacy-focused note t 3AB4 aking app with sync capabilities for Windows, macOS, Linux, Android and iOS.
Fork of turndown-plugin-gfm for Jopin