- DCST, Tsinghua University
- Beijing, China
- https://blog.sengxian.com/
Stars
A Distributed Attention Mechanism Towards Linear Scalability for Ultra-Long-Context, Heterogeneous Data Training
DeepEP: an efficient expert-parallel communication library
Node.js + JavaScript reference client for the Realtime API (beta)
CodeGeeX4-ALL-9B, a versatile model for all AI software development scenarios, including code completion, code interpreter, web search, function calling, repository-level Q&A and much more.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
GLM-4 series: Open Multilingual Multimodal Chat LMs
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups at small-to-medium batch sizes of 16-32 tokens.
AgentTuning: Enabling Generalized Agent Abilities for LLMs
The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Fast and memory-efficient exact attention
ChatGLM2-6B: An Open Bilingual Chat LLM
🩺 The first Chinese multimodal medical LLM that can read chest X-rays (chest radiograph summarization).
Chinese and English multimodal conversational language model
A new markup-based typesetting system that is powerful and easy to learn.
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
ChatGLM-6B: An Open Bilingual Dialogue Language Model
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Human ChatGPT Comparison Corpus (HC3), Detectors, and more! 🔥
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
WireGuard client that exposes itself as a SOCKS5 proxy
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
🗂️ A file list/WebDAV program that supports multiple storage backends, powered by Gin and Solidjs.
VDI Stream Client is a tiny, low-latency, GPU-accelerated client for connecting to a Windows machine running a Parsec host.
Live-streaming player for iOS and Android supporting RTMP/HTTP-FLV/HLS/WebRTC, built with Flutter and SRS.