gswxp2 (Shiwei Gao) / Starred · GitHub

Expert Kit is an efficient foundation of Expert Parallelism (EP) for MoE model inference on heterogeneous hardware

Rust 33 8 Updated Jul 11, 2025

A low-latency, billion-scale, and updatable graph-based vector store on SSD.

Jupyter Notebook 42 11 Updated Jul 1, 2025

[DAC'25] Official implementation of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference"

Python 58 3 Updated Jun 11, 2025

High-speed and easy-to-use LLM serving framework for local deployment

C++ 112 9 Updated Mar 18, 2025

Fast and memory-efficient exact attention

Python 18,315 1,803 Updated Jul 11, 2025

AutoMQ is a stateless/diskless Kafka on S3. 10x Cost-Effective. No Cross-AZ Traffic Cost. Autoscale in seconds. Single-digit ms latency. Multi-AZ Availability.

Java 6,777 465 Updated Jul 11, 2025

A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems

Python 183 9 Updated Oct 15, 2024

Next-generation datacenter OS built on kernel bypass to speed up unmodified code while improving platform density and security

C++ 101 14 Updated Jul 11, 2025

High-performance, scalable time-series database designed for Industrial IoT (IIoT) scenarios

C 24,090 4,934 Updated Jul 13, 2025

Tools and Reference Code for Intel Optimizations (e.g., Large Pages)

C 143 32 Updated Sep 20, 2024

Pacman: An Efficient Compaction Approach for Log-Structured Key-Value Store on Persistent Memory

C++ 44 11 Updated Dec 12, 2022

Sherman: A Write-Optimized Distributed B+Tree Index on Disaggregated Memory

C++ 110 31 Updated Oct 5, 2024

Dynamic (Temporal) Knowledge Graph Completion (Reasoning)

596 111 Updated Sep 15, 2020

Deep Learning Zero to All - Pytorch

Jupyter Notebook 1,203 1,384 Updated Nov 22, 2020

Helper tool for compiler theory course assignments (regular expressions and finite automata)

Python 14 Updated Dec 7, 2022

The automation tower defense RTS

Java 24,311 3,157 Updated Jul 12, 2025