- Nanjing University
- Nanjing
- https://www.nju.edu.cn/
yolov5 Public
Forked from ultralytics/yolov5
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Python GNU Affero General Public License v3.0 Updated Apr 21, 2025
unsloth Public
Forked from unslothai/unsloth
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
Python Apache License 2.0 Updated Mar 31, 2025
leptonai Public
Forked from leptonai/leptonai
A Pythonic framework to simplify AI service building
Python Apache License 2.0 Updated Mar 26, 2025
MICSim_V1.0 Public
Forked from MICSim-official/MICSim_V1.0
Official code of paper "MICSim: A Modular Simulator for Mixed-signal Compute-in-Memory based AI Accelerator", ASP-DAC 2025
Python Updated Mar 24, 2025
llm-awq Public
Forked from mit-han-lab/llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Python MIT License Updated Mar 18, 2025
Efficient-AI-Backbones Public
Forked from huawei-noah/Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Python Updated Mar 15, 2025
Torch-Pruning Public
Forked from VainF/Torch-Pruning
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
Python MIT License Updated Mar 14, 2025
OpenManus Public
Forked from FoundationAgents/OpenManus
No fortress, purely open ground. OpenManus is Coming.
Python MIT License Updated Mar 11, 2025
omniserve Public
Forked from mit-han-lab/omniserve
[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
C++ Apache License 2.0 Updated Mar 6, 2025
awesome Public
Forked from sindresorhus/awesome
😎 Awesome lists about all kinds of interesting topics
Creative Commons Zero v1.0 Universal Updated Mar 4, 2025
Awesome-Model-Quantization Public
Forked from Efficient-ML/Awesome-Model-Quantization
A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are continuously improving the project. Welcome to PR the works (p…
Updated Mar 4, 2025
tpu-mlir Public
Forked from sophgo/tpu-mlir
Machine learning compiler based on MLIR for Sophgo TPU.
C++ Other Updated Mar 3, 2025
fractalgen Public
Forked from LTH14/fractalgen
PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437
Python MIT License Updated Feb 25, 2025
COAT Public
Forked from NVlabs/COAT
[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training
Python Apache License 2.0 Updated Feb 22, 2025
ASiM Public
Forked from Keio-CSG/ASiM
[arXiv 2024] ACiM Inference Simulation Framework in "ASiM: Improving Transparency of SRAM-based Analog Compute-in-Memory Research with an Open-Source Simulation Framework"
Python Apache License 2.0 Updated Feb 20, 2025
autoqnn-pytorch Public
Forked from GongCheng1919/autoqnn-pytorch
autoqnn for pytorch
Python MIT License Updated Dec 31, 2024
tvm Public
Forked from apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Python Apache License 2.0 Updated Dec 17, 2024
Awesome-Quantization-Papers Public
Forked from Zhen-Dong/Awesome-Quantization-Papers
List of papers related to neural network quantization in recent AI conferences and journals.
MIT License Updated Dec 16, 2024
MIXQ Public
Forked from Qcompiler/MIXQ
MIXQ: Taming Dynamic Outliers in Mixed-Precision Quantization by Online Prediction
Python Updated Oct 29, 2024
DNN_NeuroSim_V2.1 Public
Forked from neurosim/DNN_NeuroSim_V2.1
Benchmark framework of compute-in-memory based accelerators for deep neural network (on-chip training chip focused)
DeepSpeed Public
Forked from deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Python Apache License 2.0 Updated Oct 12, 2024
xla Public
Forked from openxla/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
C++ Apache License 2.0 Updated Oct 9, 2024
TensorRT Public
Forked from NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
C++ Apache License 2.0 Updated Oct 9, 2024
flashinfer Public
Forked from flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
Cuda Apache License 2.0 Updated Oct 1, 2024
MNN Public
Forked from alibaba/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
C++ Updated Sep 27, 2024