Stars
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
A responsive, modern-looking documentation theme for Sphinx, with great support for many Sphinx extensions.
Tile primitives for speedy kernels
🚀 Kick-start your C++! A template for modern C++ projects using CMake, CI, code coverage, clang-format, reproducible dependency management and much more.
AI Assistant running within your browser.
An easy-to-understand TensorOp Matmul tutorial
An extension of TVMScript for writing simple, high-performance GPU kernels with Tensor Cores.
MLC-LLM fork of oobabooga/text-generation-webui
Vercel and web-llm template to run WASM models directly in the browser.
OpenGVLab / OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
FlashInfer: Kernel Library for LLM Serving
pacman100 / mlc-llm
Forked from mlc-ai/mlc-llm. Enable everyone to develop, optimize, and deploy AI models natively on everyone's devices.
Neovim plugin for interacting with LLMs and building editor-integrated prompts.
Run Large Language Models on RK3588 with GPU-acceleration
Structured inference with Llama 2 in your browser
Status column plugin that provides a configurable 'statuscolumn' and click handlers.
A template for modern C++ projects using CMake, Clang-Format, CI, unit testing and more, with support for downstream inclusion.