Starred repositories
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices
OpenMMLab Model Compression Toolbox and Benchmark.
Kubernetes Handbook (Kubernetes指南, "Kubernetes Guide") https://kubernetes.feisky.xyz
Automatically cordon and drain Kubernetes nodes based on node conditions
Standardized Serverless ML Inference Platform on Kubernetes
Manage any layer-7 protocol in a service mesh.
Envoy ext-proc gRPC filter usage demo with golang
Cloud-native high-performance edge/middle/service proxy
Sample implementation for Envoy External Processing (ext_proc).
Repository for the next iteration of composite service (e.g. Ingress) and load balancing APIs.
LLMPerf is a library for validating and benchmarking LLMs
Composable building blocks to build Llama Apps
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
A lightweight framework for building LLM-based agents
A simple, easy-to-use tool that helps you reset the Cursor IDE machine ID. No dependencies. Supports Windows, macOS, and Linux. Unlimited trial.
A cloud-native, open-source, unified platform for multi-cloud management and hybrid-cloud convergence.
Envoy AI Gateway is an open source project for using Envoy Gateway to handle request traffic from application clients to Generative AI services.