Oliver-ss (Song) / Starred · GitHub
  • Duke University
  • Shanghai

TradingAgents: Multi-Agents LLM Financial Trading Framework

Python 15,425 2,594 Updated Jul 8, 2025

Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…

JavaScript 6,575 600 Updated May 26, 2025

Cost-efficient and pluggable Infrastructure components for GenAI inference

Go 3,923 397 Updated Jul 16, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,871 280 Updated May 15, 2025
Python 4,413 358 Updated Jun 12, 2025

[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

C++ 719 48 Updated Mar 6, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,242 294 Updated Jul 14, 2025

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python 6,018 560 Updated Apr 11, 2025

🚀 PR-Agent (Qodo Merge open-source): An AI-Powered 🤖 Tool for Automated Pull Request Analysis, Feedback, Suggestions and More! 💻🔍

Python 8,400 981 Updated Jul 16, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 11,031 1,592 Updated Jul 16, 2025

Fast and memory-efficient exact attention

Python 18,393 1,814 Updated Jul 15, 2025

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 7,746 2,074 Updated May 22, 2025

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,897 522 Updated Apr 11, 2025

🧡 Everything is RSSible

TypeScript 37,772 8,314 Updated Jul 16, 2025

Free RSS feeds for WeChat official accounts, extensible to any app

JavaScript 2,100 88 Updated Jul 5, 2023

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,809 240 Updated Jul 12, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 38,024 6,602 Updated Jul 16, 2025

Backup & export all Evernote notes and notebooks

Python 1,212 87 Updated Apr 28, 2025

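"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart code. The Simplified and Traditional Chinese editions are updated in sync; English version in translation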

Java 114,399 14,165 Updated Jul 9, 2025

RayLLM - LLMs on Ray (Archived). Read README for more info.

1,259 92 Updated Mar 13, 2025

Running BERT without Padding

C++ 472 54 Updated Mar 18, 2022

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,151 263 Updated Jul 10, 2025

A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer

C++ 93 24 Updated Jul 14, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 52,383 8,735 Updated Jul 16, 2025

A guidance language for controlling large language models.

Jupyter Notebook 20,476 1,113 Updated Jul 16, 2025

Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]

Python 1,754 82 Updated Oct 26, 2023

Neovim config for the lazy

Lua 21,639 1,514 Updated May 12, 2025

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 43,130 6,198 Updated Jul 16, 2025

Several simple examples for popular neural network toolkits calling custom CUDA operators.

Python 1,484 198 Updated Apr 29, 2021