- San Francisco Bay Area, CA
RTopK Public
Forked from xiexi51/RTopK
Official Implementation of "RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs"
Cuda Updated Apr 2, 2025
TransformerEngine Public
Forked from NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in bot…
Cuda Apache License 2.0 Updated Oct 6, 2022
rules_cuda Public
Forked from bazel-contrib/rules_cuda
Starlark implementation of Bazel rules for CUDA.
Starlark MIT License Updated Jun 13, 2022
rules_cuda_examples Public
Forked from cloudhan/rules_cuda_examples
This repo holds the extended examples for rules_cuda.
Starlark Updated Jun 11, 2022
MinkowskiEngine Public
Forked from NVIDIA/MinkowskiEngine
Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors
Python Other Updated Jan 12, 2022
data-parallel-CPP Public
Forked from Apress/data-parallel-CPP
Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben Ashbaugh, James Brodman, Michael Kinsner, John Pennycook, Xin…
CMake Other Updated Jan 11, 2022
spconv Public
Forked from traveller59/spconv
Spatial Sparse Convolution Library
Python Apache License 2.0 Updated Jan 5, 2022
direwolf Public
Forked from wb2osz/direwolf
Dire Wolf is a software "soundcard" AX.25 packet modem/TNC and APRS encoder/decoder. It can be used stand-alone to observe APRS traffic, as a tracker, digipeater, APRStt gateway, or Internet Gatewa…
C GNU General Public License v2.0 Updated Jan 4, 2022
awesome-reMarkable Public
Forked from reHackable/awesome-reMarkable
A curated list of projects related to the reMarkable tablet
Creative Commons Zero v1.0 Universal Updated Apr 9, 2021
bazel-examples Public
Forked from OasisDigital/bazel-examples
Examples of Bazel use
Java Updated Dec 1, 2020
raytracinginoneweekendincuda Public
Forked from rogerallen/raytracinginoneweekendincuda
The code for the ebook Ray Tracing in One Weekend by Peter Shirley translated to CUDA by Roger Allen. This work is in the public domain.
C++ Updated Oct 12, 2020
Vitis_Accel_Examples Public
Forked from Xilinx/Vitis_Accel_Examples
Vitis_Accel_Examples
C++ Other Updated Sep 15, 2020
torch2trt Public
Forked from NVIDIA-AI-IOT/torch2trt
An easy to use PyTorch to TensorRT converter
Python MIT License Updated Sep 3, 2020
brevitas Public
Forked from Xilinx/brevitas
Brevitas: quantization-aware training in PyTorch
Python Other Updated Aug 11, 2020
nandland Public
Forked from nandland/nandland
All code found on nandland is here. underconstruction.gif
Verilog Updated Jul 19, 2020
TensorRT Public
Forked from NVIDIA/TensorRT
TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
C++ Apache License 2.0 Updated Jul 14, 2020
open-gpu-doc Public
Forked from NVIDIA/open-gpu-doc
Documentation of NVIDIA chip/hardware interfaces
Get_Moving_With_Alveo Public
Forked from Xilinx/Get_Moving_With_Alveo
For publishing the source for UG1352 "Get Moving with Alveo"
C++ Updated Jun 17, 2020
rpi-gpio-dma-demo Public
Forked from hzeller/rpi-gpio-dma-demo
Performance writing to GPIO with CPU and DMA on the Raspberry Pi
C Updated May 28, 2020
nvidia_libs_test Public
Forked from google/nvidia_libs_test
Tests and benchmarks for cuDNN (and in the future, other NVIDIA libraries)
C++ Apache License 2.0 Updated May 6, 2020
gradient-checkpointing Public
Forked from cybertronai/gradient-checkpointing
Make huge neural nets fit in memory
Python MIT License Updated Apr 26, 2020