Microsoft
- San Francisco
- https://ritazh.com
- @ritazzhang
Stars
A TTS model capable of generating ultra-realistic dialogue in one pass.
Model Context Protocol (MCP) server for Kubernetes and OpenShift
Composio equips your AI agents & LLMs with 100+ high-quality integrations via function calling
Cloud Native Agentic AI | Discord: https://bit.ly/kagentdiscord
mcp-use is the easiest way to interact with MCP servers with custom agents
⚡ Guidance, samples, and tools for HPC workloads on AKS clusters with RDMA and InfiniBand support, including GPUDirect RDMA.
GenAI inference performance benchmarking tool
Constrain, log and scan your MCP connections for security vulnerabilities.
A comprehensive security checklist for MCP-based AI tools. Built by SlowMist to safeguard LLM plugin ecosystems.
An open source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI artifact.
📦️ A fast, secure MCP server that extends its capabilities through WebAssembly plugins.
A Model Context Protocol (MCP) server for Kubernetes that enables AI assistants like Claude, Cursor, and others to interact with Kubernetes clusters through natural language.
hyperlight-wasm is a Rust library crate that enables Wasm modules and components to be run inside a lightweight virtual-machine-backed sandbox. It is built on top of Hyperlight.
A Datacenter Scale Distributed Inference Serving Framework
SGLang is a fast serving framework for large language models and vision language models.
Kubernetes RBAC authorizing HTTP proxy for a single upstream.
Cost-efficient and pluggable Infrastructure components for GenAI inference
Health checks for Azure N- and H-series VMs.
Gateway API Inference Extension
Hyperlight is a lightweight Virtual Machine Manager (VMM) designed to be embedded within applications. It enables safe execution of untrusted code within micro virtual machines with very low latenc…
Agentic AI framework for enterprise workflow automation.
This Kubernetes fork is intended to provide long term support for Kubernetes releases, but is not an official release of the Kubernetes project. For more information, please see https://github.com/…
Basic Streamlit application for testing and displaying multi-GPU LLM timings
A high-throughput and memory-efficient inference and serving engine for LLMs
Containerized Python-based framework for running and visualizing benchmark workloads on any Kubernetes/OpenShift and runtime kinds: pods, Kata Containers, and KubeVirt virtual machines simply and sa…
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
📦 Produce secure packages and containers with declarative configurations
InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing