Stars
Envoy AI Gateway is an open source project for using Envoy Gateway to handle request traffic from application clients to Generative AI services.
Cost-efficient and pluggable infrastructure components for GenAI inference
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Arks is a cloud-native inference framework running on Kubernetes
We want to create a repo to illustrate the usage of transformers in Chinese
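"Open-Source LLM Cookbook": a tutorial tailored for Chinese users on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment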
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
A high-throughput and memory-efficient inference and serving engine for LLMs