AaronJing (Iceberg) / Starred · GitHub

Context7 MCP Server -- Up-to-date code documentation for LLMs and AI code editors

JavaScript 15,489 766 Updated Jun 26, 2025
C++ 20 4 Updated Feb 12, 2025

CUDA Matrix Multiplication Optimization

Cuda 196 21 Updated Jul 19, 2024

Efficient GPU support for LLM inference with x-bit quantization (e.g. FP6, FP5).

Cuda 252 20 Updated Oct 28, 2024

A stand-alone implementation of several NumPy dtype extensions used in machine learning.

C++ 277 41 Updated Jun 2, 2025

Proper implementation of ResNet-s for CIFAR10/100 in pytorch that matches description of the original paper.

Python 1,287 332 Updated Jun 18, 2024
Scala 30 2 Updated Nov 6, 2024

Findpapers: A tool for helping researchers who are looking for related works

Python 273 36 Updated Feb 5, 2024

FUE5 is a fan-made project that aims to show how Factorio would look and behave in 3D. This project has no affiliation with the official Factorio game.

1,851 64 Updated Jun 16, 2023

Library of approximate arithmetic circuits

Verilog 55 19 Updated Sep 8, 2022

Approximate layers - TensorFlow extension

C 27 12 Updated Apr 14, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 53,073 6,499 Updated Jun 27, 2025

A collection of research papers on efficient training of DNNs

70 8 Updated Jul 6, 2022

PyTorch emulation library for Microscaling (MX)-compatible data formats

Python 251 34 Updated Jun 18, 2025
Scala 4 Updated May 11, 2024

Grok open release

Python 50,288 8,354 Updated Aug 30, 2024

Synthesizable IEEE 754 floating-point library in Verilog

Verilog 647 158 Updated Mar 13, 2023

Development repository for the Triton language and compiler

MLIR 15,969 2,071 Updated Jun 27, 2025

Implementation of Transformer Model in Tensorflow

Python 470 90 Updated Mar 25, 2023

A simple high performance CUDA GEMM implementation.

Cuda 382 42 Updated Jan 4, 2024

YoloV3 Implemented in Tensorflow 2.0

Jupyter Notebook 2,512 898 Updated Aug 30, 2024

YOLOv4 42.0% mAP. PP-YOLO 45.1% mAP.

Python 445 127 Updated Dec 17, 2020

tfyolo: Efficient Implementation of Yolov5 in TensorFlow

Python 233 72 Updated Apr 3, 2024

transformer in tensorflow 2.0

Jupyter Notebook 64 21 Updated Apr 30, 2021

This is a fast and concise implementation of Faster R-CNN with TensorFlow2.

Python 26 10 Updated Mar 21, 2023

Comp9444 - cv project

Jupyter Notebook 2 1 Updated Aug 2, 2022

📝 Some source code about matrix multiplication implementation on CUDA

Cuda 34 9 Updated Sep 12, 2018

Curated content for DNN approximation, acceleration ... with a focus on hardware accelerator and deployment

26 6 Updated May 15, 2024

Berkeley's Spatial Array Generator

Scala 978 201 Updated Apr 13, 2025