- USTC
- Hefei, China
- https://guopeng-gpli.github.io/
Stars
[ICML 2025🔥] ParallelComp: Parallel Long-Context Compressor for Length Extrapolation
Framework for running AI locally on mobile devices and wearables. Hardware-aware C/C++ backend with wrappers for Flutter & React Native. Kotlin & Swift coming soon.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
FlashMLA: Efficient MLA decoding kernels
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
A GPU/CUDA implementation of the Hungarian algorithm
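The Hungarian algorithm solves the linear assignment problem (minimum-cost one-to-one matching between rows and columns of a cost matrix) in O(n^3); the repo above accelerates it on the GPU. As a minimal, hedged reference for what "solving the assignment problem" means, here is a brute-force sketch for tiny inputs (the function name and cost matrix are illustrative, not the repo's API):

```python
from itertools import permutations

def assignment_brute_force(cost):
    """Reference solver for the linear assignment problem.

    The Hungarian algorithm computes the same result in O(n^3);
    this O(n!) enumeration is only for clarity on small matrices.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        # perm[i] = column assigned to row i
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_perm, best_cost

cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
perm, total = assignment_brute_force(cost)
print(perm, total)  # -> (1, 0, 2) 5
```

A production CPU baseline would be `scipy.optimize.linear_sum_assignment`; the CUDA version parallelizes the augmenting-path search across threads.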
Beginner-friendly serverless LLM deployment with Replicate & fly.io
Caribou is a framework for geo-distributed deployment of serverless workflows to save carbon emissions.
A LaTeX template for the USTC (University of Science and Technology of China) thesis proposal (开题报告)
Code for reproducing results for SOSP paper Bagpipe
Efficient and easy multi-instance LLM serving
📚A curated list of Awesome LLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
PyTorch-based Chinese intent recognition and slot filling
Production-ready platform for agentic workflow development.
BERT-based intent and slots detector for chatbots.
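Slot detectors like the two repos above typically tag each token with BIO labels (B-slot, I-slot, O) and then decode those labels into slot spans. A minimal sketch of that decoding step, under the assumption of standard BIO tags (function name and example tags are illustrative, not either repo's API):

```python
def bio_to_spans(tokens, tags):
    """Decode per-token BIO tags into (slot_type, text) spans."""
    spans, cur_type, cur_toks = [], None, []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # a new slot begins; flush any open span first
            if cur_type:
                spans.append((cur_type, " ".join(cur_toks)))
            cur_type, cur_toks = tag[2:], [tok]
        elif tag.startswith("I-") and cur_type == tag[2:]:
            # continuation of the current slot
            cur_toks.append(tok)
        else:
            # O tag (or stray I-): close any open span
            if cur_type:
                spans.append((cur_type, " ".join(cur_toks)))
            cur_type, cur_toks = None, []
    if cur_type:
        spans.append((cur_type, " ".join(cur_toks)))
    return spans

print(bio_to_spans(["book", "a", "flight", "to", "new", "york"],
                   ["O", "O", "O", "O", "B-city", "I-city"]))
# -> [('city', 'new york')]
```

The intent label is usually predicted separately from the [CLS] representation, while the slot tags come from the per-token outputs.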
A ChatGPT (GPT-3.5) & GPT-4 workload trace to optimize LLM serving systems
A curated list of high-quality papers on resource-efficient LLMs 🌱
Serverless LLM Serving for Everyone.
Large Language Model (LLM) Systems Paper List
A curated list for Efficient Large Language Models
Semantic Kernel (SK) is a lightweight SDK enabling integration of AI Large Language Models (LLMs) with conventional programming languages.
🚀 Docker image proxy: uses GitHub Actions to mirror images from docker.io, gcr.io, registry.k8s.io, k8s.gcr.io, quay.io, ghcr.io, and other overseas registries to registries in China for faster downloads
Secure Transformer Inference is a protocol for serving Transformer-based models securely.