sjchoi1 (Sangjin Choi) / Starred · GitHub

TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators

Python 62 6 Updated Jun 14, 2025

KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems

Python 469 48 Updated Jul 10, 2025

NVIDIA Linux open GPU kernel module source

C 15,972 1,431 Updated Jul 7, 2025

CUDATracePreload is a dynamic tracing tool for CUDA and NCCL API calls.

C++ 3 Updated Dec 6, 2023
Python 18 Updated Nov 5, 2024

An interference-aware scheduler for fine-grained GPU sharing

Python 141 24 Updated Jan 26, 2025
C 24 6 Updated Aug 19, 2022

How much energy do GenAI models consume?

Python 45 5 Updated May 13, 2025

Online CUDA Occupancy Calculator

CoffeeScript 79 12 Updated Oct 12, 2021

Measure and optimize the energy consumption of your AI applications!

Python 273 34 Updated Jun 21, 2025

LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism

C 86 16 Updated Dec 24, 2021
Python 25 5 Updated Aug 31, 2023
JavaScript 11 Updated May 27, 2025

An unnecessarily tiny implementation of GPT-2 in NumPy.

Python 3,381 439 Updated Apr 24, 2023
Python 14 1 Updated Jan 24, 2023

Sniff CUDA ioctls

C 196 28 Updated May 4, 2023

Official code repository for "CoVA: Exploiting Compressed-Domain Analysis to Accelerate Video Analytics [USENIX ATC 22]"

Rust 16 3 Updated Sep 19, 2024

Cluster Far Mem, framework to execute single job and multi job experiments using fastswap

Python 21 11 Updated Jan 12, 2024

A recurrent (LSTM) neural network in C

C 94 15 Updated Jan 13, 2022

💻 macOS / Ubuntu dotfiles

Shell 1,494 298 Updated May 11, 2025

Example C++ CUDA implementation for training Neural Network on MNIST dataset

Jupyter Notebook 27 12 Updated Jun 15, 2018