bblueskydream (蓝天的梦) / Starred · GitHub

Starred repositories

a small build system with a focus on speed

C++ 11,960 1,679 Updated May 15, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 14,220 1,005 Updated May 31, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 8,797 1,097 Updated May 31, 2025

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 1,128 73 Updated May 29, 2025

Tensor library for machine learning

C++ 12,613 1,246 Updated May 31, 2025

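AIInfra (AI infrastructure) refers to the AI system stack, from underlying hardware such as chips up to the upper-layer software stack, that supports training and inference of large AI models.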

Python 2,793 362 Updated May 30, 2025

The Go programming language

Go 128,105 18,075 Updated May 30, 2025

Visual Studio Code

TypeScript 172,906 32,798 Updated May 31, 2025

Hand-written CUDA operators and interview guide

Cuda 386 48 Updated Jan 15, 2025

Making large AI models cheaper, faster and more accessible

Python 40,920 4,519 Updated May 29, 2025

a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.

C++ 1,522 200 Updated Apr 7, 2025

Universal LLM Deployment Engine with ML Compilation

Python 20,722 1,741 Updated May 31, 2025

Large World Model -- Modeling Text and Video with Millions Context

Python 7,276 557 Updated Oct 19, 2024

A modern GUI client based on Tauri, designed to run in Windows, macOS and Linux for tailored proxy experience

TypeScript 59,885 4,584 Updated May 31, 2025

Large Context Attention

Python 711 53 Updated Jan 24, 2025

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,642 379 Updated Apr 1, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,315 2,240 Updated Feb 1, 2025
Python 1,930 219 Updated May 29, 2025

Large Language Model (LLM) Systems Paper List

1,248 70 Updated May 23, 2025

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,249 74 Updated Mar 6, 2025

Transformer related optimization, including BERT, GPT

C++ 6,175 904 Updated Mar 27, 2024

Dynamic Memory Management for Serving LLMs without PagedAttention

C 383 30 Updated May 30, 2025

[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

C++ 686 45 Updated Mar 6, 2025

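Large-model knowledge sharing that everyone can understand; a must-read before LLM interviews in the spring/autumn recruiting seasons, so you can talk confidently with interviewers.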

Jupyter Notebook 3,045 300 Updated May 17, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 4,137 392 Updated May 31, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 3,081 318 Updated May 31, 2025

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 31,474 6,536 Updated Jan 9, 2025

Evolutionary Scale Modeling (esm): Pretrained language models for proteins

Python 3,626 695 Updated Feb 7, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 51,237 6,192 Updated May 31, 2025