wuyongwei (Mick) / Starred · GitHub

"LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python 16,681 2,289 Updated May 23, 2025

Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

Python 6,062 593 Updated May 24, 2025

Uses GPT to organize and access information and to generate questions. The long-term goal is to build an agent-like research assistant.

Jupyter Notebook 689 54 Updated Dec 20, 2023

Wan: Open and Advanced Large-Scale Video Generative Models

Python 11,659 1,332 Updated May 17, 2025

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Python 890 82 Updated May 18, 2025

Redis for LLMs

Python 1,166 170 Updated May 24, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 10,618 761 Updated May 15, 2025

CSS 76 4 Updated Apr 3, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 4,093 376 Updated May 24, 2025

Python 65 6 Updated Apr 2, 2025

Easy-to-use and powerful LLM and SLM library with awesome model zoo.

Python 12,603 3,032 Updated May 23, 2025

Open-Sora: Democratizing Efficient Video Production for All

Python 26,519 2,572 Updated Apr 30, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 947 60 Updated Apr 15, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 8,924 886 Updated May 21, 2025

Expert Parallelism Load Balancer

Python 1,197 191 Updated Mar 24, 2025

Analyze computation-communication overlap in V3/R1.

1,040 142 Updated Mar 21, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,782 295 Updated Mar 10, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,693 773 Updated May 23, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,571 836 Updated Apr 29, 2025

Muon is Scalable for LLM Training

1,048 48 Updated Mar 28, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,555 1,448 Updated May 25, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,779 277 Updated May 15, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,243 256 Updated May 25, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 14,145 1,004 Updated May 23, 2025

The Triton TensorRT-LLM Backend

Shell 840 122 Updated May 21, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 3,030 314 Updated May 24, 2025

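AIInfra (AI infrastructure) refers to the AI system stack, from underlying hardware such as chips up to the upper-layer software stack, that supports training and inference of large AI models.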

Python 2,696 357 Updated May 23, 2025

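AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.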

Jupyter Notebook 13,679 1,963 Updated May 25, 2025