Stars
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
Using GPT to organize and access information, and generate questions. Long-term goal is to make an agent-like research assistant.
Wan: Open and Advanced Large-Scale Video Generative Models
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
A Datacenter Scale Distributed Inference Serving Framework
Easy-to-use and powerful LLM and SLM library with awesome model zoo.
Open-Sora: Democratizing Efficient Video Production for All
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Analyze computation-communication overlap in V3/R1.
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient MLA decoding kernels
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations for efficient inference on NVIDIA GPUs.
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
The Triton TensorRT-LLM Backend
FlashInfer: Kernel Library for LLM Serving
AIInfra (AI infrastructure) refers to the AI system stack, from low-level hardware such as chips up to the software stack that supports training and inference of large AI models.
AISystem mainly refers to AI systems, covering the full stack of low-level AI technologies, including AI chips, AI compilers, and AI inference and training frameworks.