Stars
Implementation connecting Arrow to Spark, effectively making all code related to reading in Spark redundant.
UCLA-VAST / tapa
Forked from rapidstream-org/rapidstream-tapa
TAPA is a dataflow HLS framework that features fast compilation and an expressive programming model, and generates high-frequency FPGA accelerators.
Developing Spark External Data Sources using the V2 API
PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations
PyTorch Memory Efficient Sparse Sparse Matrix Multiplication
Benchmarks of different devices I have come across
Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory and energy consumption
Enabling PyTorch on XLA Devices (e.g. Google TPU)
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Code repo for "An Empirical Evaluation of Columnar Storage Formats" VLDB Vol 17
A Profiler for Identifying the Major Sources of Performance Variance in Modern Applications
Code for paper "Engineering a High-Performance GPU B-Tree" accepted to PPoPP 2019
The Art of Latency Hiding in Modern Database Engines (VLDB 2024)
An in-memory Postgres DB instance for your unit tests
GitHub Repo for Aria: A Fast and Practical Deterministic OLTP Database