haohui (Haohui Mai) / Starred · GitHub

Optimized FP16/BF16 x FP4 GPU kernels for AMD GPUs

C++ 9 1 Updated Jul 3, 2025

A simple Flash Attention v2 implementation with ROCm (RDNA3 GPU, roc wmma), mainly used for Stable Diffusion (ComfyUI) in Windows ZLUDA environments.

Python 43 6 Updated Aug 25, 2024

collection of benchmarks to measure basic GPU capabilities

C++ 387 56 Updated Feb 11, 2025

Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?

Jupyter Notebook 1,681 66 Updated May 13, 2024

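fastllm is a high-performance large-model inference library with no backend dependencies. It supports both tensor-parallel inference of dense models and mixed-mode inference of MoE models; any GPU with 10 GB or more of memory can run the full DeepSeek model. A dual-socket 9004/9005 server with a single GPU can deploy the original full-size, full-precision DeepSeek model at 20 tps for a single concurrent request; the INT4-quantized model reaches 30 tps at single concurrency and 60+ tps with multiple concurrent requests.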

C++ 3,731 379 Updated Jul 2, 2025

Interact with your documents using the power of GPT, 100% privately, no data leaks

Python 56,187 7,537 Updated Nov 13, 2024

A platform for building proxies to bypass network restrictions.

Go 46,156 8,915 Updated May 28, 2025

Open Hardware Monitor

C# 6,192 1,291 Updated Jul 13, 2024

OpenSource tool for monitoring, configuring and overclocking NVIDIA GPUs

C 2 Updated Feb 21, 2020

GPUVerify: a Verifier for GPU Kernels

C# 62 16 Updated Jul 28, 2022

An unofficial cuda assembler, for all generations of SASS, hopefully :)

Python 509 86 Updated Apr 20, 2023

A C++ remake of 《金庸群侠传》 (Heroes of Jin Yong); development is complete.

C++ 2,748 388 Updated Jul 4, 2025

Radeon reverse engineering tools

Python 150 17 Updated Mar 29, 2020

Tools for people envious of nvidia's blob driver.

C 477 96 Updated Oct 26, 2023
C++ 9 Updated Aug 23, 2019

Assembler for NVIDIA Volta and Turing GPUs

Python 223 40 Updated Jan 13, 2022

Mythril is a symbolic-execution-based security analysis tool for EVM bytecode. It detects security vulnerabilities in smart contracts built for Ethereum and other EVM-compatible blockchains.

Python 4,058 776 Updated Jun 9, 2025

SQL-based streaming analytics platform at scale

Java 1,225 286 Updated Jun 21, 2020

C++ library for zkSNARKs

C++ 1,880 591 Updated Jun 12, 2025

Official repository of the AWS EC2 FPGA Hardware and Software Development Kit

SystemVerilog 1,575 525 Updated Jul 1, 2025

Beringei is a high performance, in-memory storage engine for time series data.

C++ 3,167 293 Updated Jul 11, 2018

Equihash miner for NiceHash

C++ 769 579 Updated Dec 27, 2018

A curated list of Deep Learning hardware, cycle/memory optimisation techniques

41 14 Updated Aug 9, 2016

Firmware Analysis Tool

Rust 12,661 1,659 Updated Apr 14, 2025

The ExpressOS kernel

C# 16 5 Updated Jun 7, 2013

A pure front-end web UI for you-know-which bbs.

JavaScript 26 10 Updated Mar 7, 2016