zifeitong / Starred · GitHub

Guidelines Support Library

C++ 6,419 750 Updated May 22, 2025

Asynchronous gRPC with Asio/unified executors

C++ 413 37 Updated May 25, 2025

An Extensible Deep Learning Library

Python 2,058 329 Updated May 28, 2025

Applied AI experiments and examples for PyTorch

Python 270 28 Updated May 27, 2025

Multi-GPU CUDA stress test

C++ 1,705 324 Updated Aug 20, 2024

Where GPUs get cooked 👩‍🍳🔥

Rust 230 12 Updated Mar 4, 2025

Python 115 15 Updated May 21, 2025

📚 A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, Parallelism, MLA, etc.

Python 4,057 279 Updated May 27, 2025

NanoGPT (124M) in 3 minutes

Python 2,592 310 Updated May 27, 2025

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

C++ 775 63 Updated May 14, 2025

🤗 smolagents: a barebones library for agents that think in code.

Python 19,342 1,677 Updated May 28, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 1,210 93 Updated May 28, 2025

Efficient Triton Kernels for LLM Training

Python 5,104 335 Updated May 27, 2025

Optimizing inference proxy for LLMs

Python 2,417 178 Updated May 27, 2025

Eclipse iceoryx2™ - true zero-copy inter-process-communication in pure Rust

Rust 1,496 64 Updated May 28, 2025

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 1,106 159 Updated Mar 26, 2025

Protected Auction Key/Value Service

C++ 62 25 Updated May 23, 2025

The MLscript programming language. Functional and object-oriented; structurally typed and sound; with powerful type inference. Soon to have full interop with TypeScript!

Scala 191 32 Updated May 27, 2025

PROPELLER: Profile Guided Optimizing Large Scale LLVM-based Relinker

C++ 416 42 Updated May 28, 2025

Felafax is building AI infra for non-NVIDIA GPUs

Jupyter Notebook 561 35 Updated Jan 24, 2025

Sparse nonlinear least squares in JAX

Python 207 13 Updated May 26, 2025

Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild

Zig 2,280 80 Updated May 28, 2025

Experimentation using the xla compiler from rust

Rust 93 16 Updated Aug 17, 2024

Efficient and easy multi-instance LLM serving

Python 420 32 Updated May 28, 2025

A JAX research toolkit for building, editing, and visualizing neural networks.

Python 1,780 63 Updated Apr 26, 2025

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 714 125 Updated Feb 21, 2025

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 618 49 Updated May 5, 2025

Meaningful control of data in distributed systems.

Rust 1,362 121 Updated May 28, 2025

Riegeli/records is a file format for storing a sequence of string records, typically serialized protocol buffers.

C++ 439 54 Updated May 28, 2025

Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

Python 792 40 Updated Apr 30, 2025