tonylt (tonylt) / Starred · GitHub

Suna - Open Source Generalist AI Agent

TypeScript 14,367 2,138 Updated Jun 11, 2025

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.

Python 1,309 160 Updated Jun 10, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,439 616 Updated Jun 11, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 49,550 7,955 Updated Jun 13, 2025

Official code of ORION

Python 271 18 Updated Apr 22, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,597 841 Updated Apr 29, 2025
Python 130 8 Updated Feb 15, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 3,171 327 Updated Jun 13, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 15,104 1,987 Updated Jun 13, 2025

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 11,125 1,626 Updated Apr 26, 2025

Quantized Attention achieves speedups of 2-5x and 3-11x over FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.

Cuda 1,709 121 Updated Jun 11, 2025

LLMPerf is a library for validating and benchmarking LLMs

Python 933 166 Updated Dec 9, 2024

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 486 56 Updated Jun 9, 2025

📚A curated list of Awesome LLM Inference Papers with Codes.

Python 4,118 286 Updated Jun 9, 2025

Awesome LLM compression research papers and tools.

1,558 98 Updated Jun 6, 2025

📱 Collaborative List of Open-Source iOS Apps

45,278 5,562 Updated Jun 13, 2025

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 7,849 571 Updated Jan 3, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,730 1,495 Updated Jun 13, 2025

Fast and memory-efficient exact attention

Python 17,816 1,741 Updated Jun 10, 2025

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 34,441 4,929 Updated Jun 12, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 40,510 3,211 Updated Jun 12, 2025

SwiftUI-DesignCode is a set of examples written while learning SwiftUI 2.0

Swift 262 29 Updated Apr 18, 2024

LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.

Python 21,855 2,871 Updated Jun 13, 2025

Universal LLM Deployment Engine with ML Compilation

Python 20,781 1,745 Updated Jun 8, 2025

openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.

Python 54,080 9,811 Updated Jun 13, 2025

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 29,387 3,452 Updated Jun 13, 2025

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…

TypeScript 35,279 5,911 Updated Mar 25, 2025

Open-source simulator for autonomous driving research.

C++ 12,580 4,062 Updated Jun 13, 2025

Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

C++ 17,203 4,725 Updated May 15, 2025

Open deep learning compiler stack for CPU, GPU, and specialized accelerators

Python 1 Updated Feb 6, 2024