Northwestern University
- Evanston, IL
- https://yifei-zuo.github.io/
- in/yifei-zuo-5a6138235
Starred repositories
Using PyTorch autograd to compute Hessian of Perplexity for Large Language Models
LeanRL is a fork of CleanRL, where selected PyTorch scripts are optimized for performance using compile and cudagraphs.
wolfecameron / nanoMoE
Forked from karpathy/nanoGPT
An extension of the nanoGPT repository for training small MoE models.
Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"
Technical report of Kimina-Prover Preview.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
A Python-embedded modeling language for convex optimization problems.
A PyTorch native library for large-scale model training
Distributed Triton for Parallel Systems
Distributed Asynchronous Hyperparameter Optimization in Python
AAAI 2022 Paper: Bet even Beth Harmon couldn't learn chess like that :)
Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels.
PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation preconditioner and more)
A Datacenter Scale Distributed Inference Serving Framework
code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Pretraining infrastructure for multi-hybrid AI model architectures
This repo is based on https://github.com/jiaweizzhao/GaLore
Computer gaming agents that run on your PC and laptops.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
TensorDict is a PyTorch-dedicated tensor container.
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient MLA decoding kernels