skykongkong8 (Sungsik Kong) / Starred · GitHub

Organizations

@kucc @nnstreamer @Guerilla-Coders

BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library

C++ 2,590 285 Updated Dec 20, 2024

MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels.

C++ 133 31 Updated Sep 25, 2023

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 52,079 8,662 Updated Jul 13, 2025

MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.

C++ 5,012 825 Updated Jun 17, 2024

Fast Multimodal LLM on Mobile Devices

C++ 949 115 Updated Jun 14, 2025

A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.

3,475 445 Updated May 23, 2025

Official inference framework for 1-bit LLMs

Python 20,500 1,536 Updated Jun 3, 2025

Inference Llama 2 in one file of pure C

C 18,548 2,295 Updated Aug 6, 2024

Main gperftools repository

C++ 8,750 1,524 Updated Jun 5, 2025

Tensor library for machine learning

C++ 12,814 1,282 Updated Jul 12, 2025

LLM inference in C/C++

C++ 82,932 12,320 Updated Jul 12, 2025

An open-source RAG-based tool for chatting with your documents.

Python 22,782 1,832 Updated Jul 4, 2025
Jupyter Notebook 37 30 Updated May 13, 2025

The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for float…

Nim 286 14 Updated Jan 4, 2024

INACTIVE - http://mzl.la/ghe-archive - FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 3 Updated Mar 18, 2020

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

C++ 437 209 Updated Jul 11, 2025

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

C++ 700 148 Updated Oct 18, 2023

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.

C 6,859 1,572 Updated Jul 11, 2025

TinyChatEngine: On-Device LLM Inference Library

C++ 871 89 Updated Jul 4, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,994 1,578 Updated Jul 12, 2025

LLM training in simple, raw C/CUDA

Cuda 27,129 3,121 Updated Jun 26, 2025

21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

Jupyter Notebook 91,728 46,942 Updated Jul 7, 2025

Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots

Python 941 180 Updated Dec 18, 2024

row-major matmul optimization

C++ 646 89 Updated Sep 9, 2023

Accessible large language models via k-bit quantization for PyTorch.

Python 7,215 720 Updated Jul 8, 2025

Low-precision matrix multiplication

C++ 1,810 456 Updated Jan 29, 2024

Encapsulate the frequently used AVX instructions as independent modules to reduce repeated development workload.

C 123 45 Updated Jan 13, 2024

High-efficiency floating-point neural network inference operators for mobile, server, and Web

C 2,063 431 Updated Jul 11, 2025