Stars
Multi-GPU dynamic scheduler using PGAS-style cross-GPU communication
QUDA is a library for performing calculations in lattice QCD on GPUs.
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Papers from the computer science community to read and discuss.
A General-purpose Task-parallel Programming System using Modern C++
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Efficient and High-quality Graph Coloring on the GPU
Source code examples from the Parallel Forall Blog
A Python implementation of global optimization with Gaussian processes.
Models and examples built with TensorFlow
Depend on Scala modules like a pro