Shanghai University (SHU)
- Baoshan, Shanghai (UTC+08:00)
- https://drewjin.github.io/
- https://scholar.google.com.hk/citations?user=L220uBgAAAAJ&hl=zh-CN
Stars
HPC-SJTU / xfold
Forked from Shenggan/xfold. Democratizing AlphaFold3: a PyTorch reimplementation to accelerate protein structure prediction
Official PyTorch implementation for the paper AutoJudge: Judge Decoding Without Manual Annotation
Democratizing AlphaFold3: a PyTorch reimplementation to accelerate protein structure prediction
verl: Volcano Engine Reinforcement Learning for LLMs
A framework for few-shot evaluation of language models.
Shanghai University Computer Architecture Experiments
Machine Learning Engineering Open Book
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache
A collection of awesome works centered on reasoning models like O1/R1 in the visual domain
😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [arXiv '25]
An extremely fast Python package and project manager, written in Rust.
SGLang is a fast serving framework for large language models and vision language models.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A high-performance, fully-featured CSV parser and serializer for modern C++.
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA etc.🔥
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.
Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of GPT-Fast, a simple, PyTorch-native generation codebase.
📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism, etc. 🎉🎉
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search