smpanaro (Stephen Panaro) / Starred · GitHub
Python 2 Updated May 16, 2025

Modified K-means Algorithm with Local Optimality Guarantees (ICML 2025)

C++ 2 Updated Jun 16, 2025

Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025)

Python 30 Updated Jun 19, 2025

Parse and disassemble .metallib files in browser

JavaScript 40 5 Updated Jul 24, 2023
Python 61 4 Updated Jun 17, 2025

Code implementation of GPTAQ (https://arxiv.org/abs/2504.02692)

Python 47 Updated Jun 1, 2025

A Python package for optimal 1D k-means clustering.

C++ 52 8 Updated Jan 1, 2025

Code repository for ICLR 2025 paper "LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid"

Python 15 1 Updated Mar 2, 2025

Official Implementation of "KBLaM: Knowledge Base augmented Language Model"

Jupyter Notebook 1,324 112 Updated Apr 29, 2025

Local Deep Research is an AI-powered assistant that transforms complex questions into comprehensive, cited reports by conducting iterative analysis using any LLM across diverse knowledge sources in…

Python 2,889 287 Updated Jun 22, 2025

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 490 56 Updated Jun 17, 2025

Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.

Python 628 92 Updated Jun 19, 2025

A machine learning software for extracting information from scholarly documents

Java 4,137 489 Updated Jun 20, 2025

A collection of app themes based on some Nostromo UI from Alien.

193 14 Updated Dec 28, 2024

Solve Puzzles. Learn Metal 🤘

Jupyter Notebook 561 28 Updated Sep 24, 2024

Mini-V is a compact core-xy printer with a build volume of 180mm³ using 2020 extrusions. Inspired to be a mini-Voron.

33 6 Updated Dec 26, 2024
Rust 14 1 Updated Dec 4, 2024

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 102,297 10,094 Updated Jun 21, 2025

Entropy Based Sampling and Parallel CoT Decoding

Python 3,387 325 Updated Nov 13, 2024

A monospaced pixel font with a lo-fi, techy vibe

TypeScript 1,534 15 Updated May 25, 2025

A pythonic generic language server

Python 663 115 Updated Jun 20, 2025

Exploring the scalable matrix extension of the Apple M4 processor

C 178 9 Updated Nov 7, 2024

[ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs.

Python 115 16 Updated May 16, 2024

[EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization

Python 37 2 Updated Sep 24, 2024

Code execution exploit for Tony Hawk's video game series

Assembly 335 14 Updated Feb 24, 2025
JavaScript 1 Updated Oct 29, 2024

The homepage of OneBit model quantization framework.

Python 181 4 Updated Feb 5, 2025

Code repo for the paper "SpinQuant: LLM quantization with learned rotations"

Python 288 43 Updated Feb 14, 2025

A collection of tricks and tools to speed up transformer models

TeX 167 10 Updated Jun 3, 2025