sarckk (Yong Hoon Shin) / Starred · GitHub

Supercharge Your LLM with the Fastest KV Cache Layer

Python 2,521 301 Updated Jul 8, 2025

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python 258 17 Updated Aug 31, 2024

Open source repo for Locate 3D Model, 3D-JEPA and Locate 3D Dataset

Python 328 24 Updated Jun 3, 2025

[ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection

Python 124 6 Updated Feb 20, 2025

Dream 7B, a large diffusion language model

Python 813 39 Updated Jun 18, 2025

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 1,184 150 Updated Jan 4, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 15,828 2,286 Updated Jul 8, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 51,758 8,570 Updated Jul 8, 2025

Dynamic Memory Management for Serving LLMs without PagedAttention

C 401 31 Updated May 30, 2025

Awesome LLM compression research papers and tools.

1,591 102 Updated Jul 2, 2025

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 9,539 912 Updated Jul 6, 2025

📰 Must-read papers and blogs on Speculative Decoding ⚡️

821 44 Updated Jun 22, 2025

A library to analyze PyTorch traces.

Python 391 62 Updated Jun 23, 2025

A curated list for Efficient Large Language Models

Python 1,768 141 Updated Jun 17, 2025

An implementation of a deep learning recommendation model (DLRM)

Python 3,918 858 Updated May 30, 2025

New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos

8,036 514 Updated Jun 9, 2025

🖥️ Run AI Agent in your browser.

Python 14,019 2,406 Updated Jun 1, 2025

Writing an OS in 1,000 lines.

C 2,667 202 Updated Jun 15, 2025

Boids implementation in C++ with spatial hashing

C++ 11 2 Updated Jul 24, 2021

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 12,451 1,269 Updated Jul 7, 2025

SOTA Open Source TTS

Python 22,270 1,822 Updated Jul 2, 2025

A powerful framework for building realtime voice AI agents 🤖🎙️📹

Python 6,679 1,054 Updated Jul 8, 2025

An implementation of bucketMul LLM inference

Swift 220 10 Updated Jul 1, 2024

The automation tower defense RTS

Java 24,292 3,153 Updated Jul 8, 2025

PyTorch domain library for recommendation systems

Python 2,260 535 Updated Jul 8, 2025

Less than 100 Kilobytes. Works for Android 5.1 and above

C 2,306 146 Updated Dec 27, 2024

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Python 28,906 1,832 Updated Mar 21, 2025

The official NGINX Open Source repository.

C 27,381 7,412 Updated Jul 3, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 42,654 7,136 Updated Dec 9, 2024

notes for software engineers getting up to speed on new AI developments. Serves as datastore for https://latent.space writing, and product brainstorming, but has cleaned up canonical references und…

HTML 5,870 502 Updated Jun 27, 2025