University of Science and Technology of China
Stars
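Lets free-tier Cursor users get around the claude-3.7 restriction; not recommended for Pro users. For learning purposes only. Please give it a star.
🔥🔥DatalinkX: a data synchronization system for heterogeneous data sources. It supports incremental or full synchronization of massive data volumes, data flow between sources such as HTTP, Oracle, MySQL, and ES, and intermediate transform operators such as SQL operators and large-model operators. It is built on the Flink and Seatunnel engines and provides flow task management, task cascade configuration, task log collection, and other features🔥🔥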
Adaptive Attention Sparsity with Hierarchical Top-p Pruning
Progressive Sparse Attention (PSA): Algorithm and System Co-design for Efficient Attention in LLM Serving
A lightweight Java framework designed for building efficient and scalable applications. Supports both command-line tools and APIs, enabling developers to create robust solutions with ease.
SGLang is a fast serving framework for large language models and vision language models.
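Advanced Computer Architecture 2020, taught by Wu Junmin; a graduate course at the University of Science and Technology of China (USTC)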
A high-throughput and memory-efficient inference and serving engine for LLMs
📚A curated list of Awesome LLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.
Tested on Windows 10 & 11 at 4K (125%, 150%, 200% scaling). Comes in 2 versions, 2 types, and 3 different sizes!
llama3 implementation one matrix multiplication at a time
NetLLM: Adapting Large Language Models for Networking (SIGCOMM 2024) - Official Repository
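Deep learning interview handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and related areas)
Python Jupyter Notebooks accompanying Andrew Ng's Machine Learning course
Learning paths and knowledge summaries for machine learning and deep learning
JavaScript library for precise tracking of facial features via Constrained Local Models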