Stars
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Protein structure diffusion model for unconditional protein generation and motif scaffolding
Code for the ProteinMPNN paper
A bunch of kernels that might make stuff slower 😉
Minimal and annotated implementations of key ideas from modern deep learning research.
Efficient 3D molecular generation with flow-matching and Semla
Source-to-Source Debuggable Derivatives in Pure Python
An experiment in using Tangent to autodiff Triton
A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
FlagGems is an operator library for large language models implemented in the Triton Language.
Denoising Hamiltonian Network for Physical Reasoning
Matplotlib styles for scientific plotting
Implementation of Flash Attention in JAX
Fast, differentiable sorting and ranking in PyTorch
TritonParse is a tool designed to help developers analyze and debug Triton kernels by visualizing the compilation process and source code mappings.
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
Python solutions to select problems from Computational Physics by Jos Thijssen
A batteries-included toolkit for the GPU-accelerated OpenMM molecular simulation engine.
Python package for solving multistage stochastic programming problems. Capable of handling general convex and hierarchical problems as well as parallel processing (via multithreading).
Anonymous Github is a proxy server that supports anonymous browsing of GitHub repositories for open-science code and data.
A high-throughput and memory-efficient inference and serving engine for LLMs
[CVPR'25] Official Implementation of MambaIC: State Space Models for High-Performance Learned Image Compression
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API