Oliver-ss (Song) / Starred · GitHub

Oliver-ss

11 followers · 4 following

Duke University
Shanghai

Achievements

Stars

TauricResearch / TradingAgents

TradingAgents: Multi-Agents LLM Financial Trading Framework

Python 15,425 2,594 Updated Jul 8, 2025

jamez-bondos / awesome-gpt4o-images

Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…

JavaScript 6,575 600 Updated May 26, 2025

vllm-project / aibrix

Cost-efficient and pluggable Infrastructure components for GenAI inference

Go 3,923 397 Updated Jul 16, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,871 280 Updated May 15, 2025

stepfun-ai / Step-Audio

Python 4,413 358 Updated Jun 12, 2025

mit-han-lab / omniserve

[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

C++ 719 48 Updated Mar 6, 2025

xlite-dev / Awesome-LLM-Inference

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,242 294 Updated Jul 14, 2025

pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 6,018 560 Updated Apr 11, 2025

qodo-ai / pr-agent

🚀 PR-Agent (Qodo Merge open-source): An AI-Powered 🤖 Tool for Automated Pull Request Analysis, Feedback, Suggestions and More! 💻🔍

Python 8,400 981 Updated Jul 16, 2025

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 11,031 1,592 Updated Jul 16, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 18,393 1,814 Updated Jul 15, 2025

NVIDIA / cuda-samples

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C 7,746 2,074 Updated May 22, 2025

AutoGPTQ / AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,897 522 Updated Apr 11, 2025

DIYgod / RSSHub

🧡 Everything is RSSible

TypeScript 37,772 8,314 Updated Jul 16, 2025

feeddd / feeds

Free RSS feeds for WeChat official accounts, extensible to any app

JavaScript 2,100 88 Updated Jul 5, 2023

flexflow / flexflow-train

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,809 240 Updated Jul 12, 2025

ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 38,024 6,602 Updated Jul 16, 2025

vzhd1701 / evernote-backup

Backup & export all Evernote notes and notebooks

Python 1,212 87 Updated Apr 28, 2025

krahets / hello-algo

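Hello Algo: a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is in translation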

Java 114,399 14,165 Updated Jul 9, 2025

anyscale / llm-continuous-batching-benchmarks

Python 120 22 Updated Mar 17, 2024

ray-project / ray-llm

RayLLM - LLMs on Ray (Archived). Read README for more info.

1,259 92 Updated Mar 13, 2025

bytedance / effective_transformer

Running BERT without Padding

C++ 472 54 Updated Mar 18, 2022

mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,151 263 Updated Jul 10, 2025

tlc-pack / cutlass_fpA_intB_gemm

A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer

C++ 93 24 Updated Jul 14, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 52,383 8,735 Updated Jul 16, 2025

guidance-ai / guidance

A guidance language for controlling large language models.

Jupyter Notebook 20,476 1,113 Updated Jul 16, 2025

hkust-nlp / ceval

Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]

Python 1,754 82 Updated Oct 26, 2023

LazyVim / LazyVim

Neovim config for the lazy

Lua 21,639 1,514 Updated May 12, 2025

run-llama / llama_index

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 43,130 6,198 Updated Jul 16, 2025

godweiyang / NN-CUDA-Example

Several simple examples for popular neural network toolkits calling custom CUDA operators.

Python 1,484 198 Updated Apr 29, 2021
