Stars
KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems
Benchmark and evaluation of AI workloads on edge devices
🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× vs cuBLAS
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
A generative world for general-purpose robotics & embodied AI learning.
Training DIAMOND to play MarioKart64 in a Neural Network.
page: https://andreslavescu.github.io/SCI206-Project/
Write a fast kernel and run it on Discord. See how you compare against the best!
Optimizing CUDA kernels using a reinforcement learning approach
Self Driving Car development tools and technologies from GTA Robotics Community members
Tensors and Dynamic neural networks in Python with strong GPU acceleration
A lightweight library for portable low-level GPU computation using WebGPU.
AndreSlavescu / Liger-Kernel
Forked from linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
AndreSlavescu / InstantSplat
Forked from NVlabs/InstantSplat
InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Supercharge Your LLM Application Evaluations 🚀
AndreSlavescu / ragas
Forked from explodinggradients/ragas
Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
Language Modeling with the H3 State Space Model
A polyhedral compiler for expressing fast and portable data parallel algorithms