- HuggingFace
- France
- 3outeille.github.io
- @FerdinandMom
Stars
- Simple MPI implementation for prototyping or learning
- Experimental repository for research implementation of NoLoCo.
- AXI, AXI stream, Ethernet, and PCIe components in SystemVerilog
- Research sandbox for decentralized pipelined inference
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments
- prime-rl is a codebase for decentralized async RL training at scale
- Scripts and instructions for replicating the original FineWeb experiments on LUMI
- Analyze computation-communication overlap in V3/R1.
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
- DeepEP: an efficient expert-parallel communication library
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
- Multi-threaded FP32 matrix multiplication on x86 CPUs
- A visual playground for agentic workflows: iterate over your agents 10x faster
- Enable AI models for video production in the browser
- Fully open reproduction of DeepSeek-R1
- Efficient implementations of state-of-the-art linear attention models
- Minimalistic 4D-parallelism distributed training framework for education purposes
- CGRA-Flow is an integrated framework for CGRA compilation, exploration, synthesis, and development.
- A baseline repository of auto-parallelism in training neural networks
- alibaba / Megatron-LLaMA — forked from NVIDIA/Megatron-LM; best practice for training LLaMA models in Megatron-LM
- The official repo of Pai-Megatron-Patch for LLM & VLM large-scale training, developed by Alibaba Cloud
- Explore training for quantized models