Ziming Mao MaoZiming

🔭

PhD student @ UC Berkeley at @ucbrise and @NetSys, CS @ Yale, @Yale-LILY, @Thesys-lab @ CMU. Prev. @databricks

118 followers · 53 following

UC Berkeley
Berkeley, CA
https://maoziming.github.io/
@ziming_mao
in/maoziming

Stars

microsoft / msccl

Microsoft Collective Communication Library

C++ 351 32 Updated Sep 20, 2023

deepseek-ai / profile-data

Analyze computation-communication overlap in V3/R1.

1,076 144 Updated Mar 21, 2025

zartbot / shallowsim

DeepSeek-V3/R1 inference performance simulator

Jupyter Notebook 154 21 Updated Mar 27, 2025

celerity / ndzip

A High-Throughput Parallel Lossless Compressor for Scientific Data

C++ 70 14 Updated Jan 22, 2023

deepseek-ai / EPLB

Expert Parallelism Load Balancer

Python 1,228 195 Updated Mar 24, 2025

llm-d / llm-d

llm-d is a Kubernetes-native high-performance distributed LLM inference framework

Makefile 1,322 105 Updated Jun 25, 2025

NVIDIA / gdrcopy

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 1,151 162 Updated Jun 5, 2025

openucx / ucc

Unified Collective Communication Library

C 259 112 Updated Jul 8, 2025

LMCache / LMCache

Supercharge Your LLM with the Fastest KV Cache Layer

Python 2,531 301 Updated Jul 8, 2025

openucx / ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

C 1,387 472 Updated Jun 30, 2025

pytorch-labs / monarch

PyTorch Single Controller

Rust 303 47 Updated Jul 9, 2025

ai-dynamo / nixl

NVIDIA Inference Xfer Library (NIXL)

C++ 451 109 Updated Jul 9, 2025

GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compression of numerical and other data types in HPC/ML applications.

Cuda 341 28 Updated Jun 18, 2025

aws / aws-ofi-nccl

This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.

C++ 177 69 Updated Jul 9, 2025

gpudirect / libibverbs

Mellanox libibverbs

C++ 70 14 Updated Aug 28, 2019

microsoft / machnet

Machnet provides applications like databases and finance an easy way to access low-latency DPDK-based messaging on public cloud VMs. 750K RPS on Azure at 61 us P99.9.

C++ 120 22 Updated Jan 28, 2025