University of Utah, Salt Lake City
Starred repositories
[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
QUDA is a library for performing calculations in lattice QCD on GPUs.
AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.
LLVM/MLIR based compiler instrumentation of AMD GPU kernels
Advanced Profiling and Analytics for AMD Hardware
FlashMLA: Efficient MLA decoding kernels
A retargetable MLIR-based machine learning compiler and runtime toolkit.
HIP: C++ Heterogeneous-Compute Interface for Portability
C++ Insights - See your source code with the eyes of a compiler
Trio – a friendly Python library for async concurrency and I/O
A lightweight library for portable low-level GPU computation using WebGPU.
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs
ROCm Platform Runtime (ROCr): an HSA-based runtime enhanced for the HPC market
A Python package that extends the official PyTorch to easily obtain performance on Intel platforms