-
tilelang Public
Forked from tile-ai/tilelang
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
C++ MIT License Updated Mar 6, 2025
-
flash-attention-wmma Public
FlashAttention2 implementation with TensorCore WMMA API
-
chipcraft---mest-course Public
Forked from efabless/chipcraft---mest-course
TL-Verilog Creative Commons Zero v1.0 Universal Updated Mar 15, 2024
-
ivy Public
Forked from ivy-llc/ivy
The Unified Machine Learning Framework
-
composer Public
Forked from mosaicml/composer
Train neural networks up to 7x faster
Python Apache License 2.0 Updated Aug 3, 2023
-
xformers Public
Forked from facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Python Other Updated Mar 30, 2023
-
openvino_notebooks Public
Forked from openvinotoolkit/openvino_notebooks
📚 Jupyter notebook tutorials for OpenVINO™
Jupyter Notebook Apache License 2.0 Updated Mar 22, 2023
-
tensorrt-examples Public
Forked from NobuoTsukamoto/tensorrt-examples
TensorRT Examples (TensorRT, Jetson Nano, Python, C++)
Jupyter Notebook MIT License Updated Mar 12, 2023
-
onnx2pytorch Public
Forked from Talmaj/onnx2pytorch
Transform ONNX model to PyTorch representation
Python Apache License 2.0 Updated Jan 16, 2023
-
tkDNN Public
Forked from ceccocats/tkDNN
Deep neural network library and toolkit to do high performance inference on NVIDIA Jetson platforms
-
darknet Public
Forked from AlexeyAB/darknet
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
-
TensorRT Public
Forked from NVIDIA/TensorRT
TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
C++ Apache License 2.0 Updated May 14, 2022
-
SGEMM-Implementation-and-Optimization Public
Forked from Huanghongru/SGEMM-Implementation-and-Optimization
📝 Some source code about matrix multiplication implementation on CUDA
Jupyter Notebook Updated Apr 17, 2022
-
sharpened_cosine_similarity_torch Public
Forked from brohrer/sharpened-cosine-similarity
A Sharpened Cosine Similarity layer for PyTorch
Python MIT License Updated Feb 20, 2022
-
producer-consumer-server Public
POSIX threads producer-consumer server. To run: make && ./pserver
C Updated Feb 16, 2021