p(doom)
- Munich (UTC +02:00)
- srambical.fr
- @lemergenz
- in/franz-srambical-418630178
pdoom.org Public
A grassroots initiative on A(G)I research disregarding dumb societal gatekeeping mechanisms.
JavaScript Apache License 2.0 Updated Jun 23, 2025
Stoix Public
Forked from EdanToledo/Stoix
🏛️ A research-friendly codebase for fast experimentation of single-agent reinforcement learning in JAX • End-to-End JAX RL
Python Apache License 2.0 Updated Jun 13, 2025
jafar Public
Forked from FLAIROx/jafar
JAX reimplementation of the DeepMind paper "Genie: Generative Interactive Environments"
TinyZero Public
Forked from Jiayi-Pan/TinyZero
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Python Apache License 2.0 Updated May 4, 2025
jax Public
Forked from jax-ml/jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Python Apache License 2.0 Updated Apr 21, 2025
nano-aha-moment Public
Forked from McGill-NLP/nano-aha-moment
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
Jupyter Notebook MIT License Updated Apr 17, 2025
mle-scheduler Public
Forked from mle-infrastructure/mle-scheduler
Lightweight Cluster/Cloud VM Job Management 🚀
Python MIT License Updated Apr 11, 2025
tuning_playbook Public
Forked from google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
Other Updated Apr 10, 2025
submitit Public
Forked from facebookincubator/submitit
Python 3.8+ toolbox for submitting jobs to Slurm
Python MIT License Updated Apr 8, 2025
scaling-book Public
Forked from jax-ml/scaling-book
Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs
HTML MIT License Updated Mar 22, 2025
grain Public
Forked from google/grain
Library for reading and processing ML training data.
Python Apache License 2.0 Updated Mar 10, 2025
miryoku_zmk Public
Forked from manna-harbour/miryoku_zmk
Miryoku is an ergonomic, minimal, orthogonal, and universal keyboard layout. Miryoku ZMK is the Miryoku implementation for ZMK.
C Updated Feb 19, 2025
sway-cursor Public
A sway-native keyboard-driven cursor with pointer acceleration.
Python MIT License Updated Feb 16, 2025
minimo Public
Forked from gpoesia/minimo
Learning Formal Mathematics from Intrinsic Motivation
Rust MIT License Updated Oct 31, 2024
mup-lr-warmup Public
We investigate the impact of learning-rate warmup on GPT-style Transformers trained for language modeling under muP and SP, using a realistic codebase (hlb-gpt).
Python Apache License 2.0 Updated Aug 21, 2024
DeepSeek-Prover-V1.5 Public
Forked from deepseek-ai/DeepSeek-Prover-V1.5
Python MIT License Updated Aug 16, 2024
aerospace.toml Public
Default config, but with cmd as modifier + input mode (binding mode without bindings) to circumvent clashes with OS bindings.
Updated Aug 12, 2024
hlb-gpt-mup-warmup Public
Forked from tysam-code/hlb-gpt
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to large…
Python Apache License 2.0 Updated Jul 29, 2024
ezmup Public
Forked from cloneofsimo/ezmup
Simple implementation of muP, based on Spectral Condition for Feature Learning
Python Updated Jul 28, 2024
mup_transformer_warmup Public
Investigation of whether learning-rate warmup can be omitted or shortened under muP.
mup Public
Forked from microsoft/mup
maximal update parametrization (µP)
Jupyter Notebook MIT License Updated Jul 17, 2024
maxtext Public
Forked from AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
Python Apache License 2.0 Updated Jul 16, 2024
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Python Apache License 2.0 Updated Jun 23, 2024
modded-nanogpt-mup-transformer-warmup Public
Forked from KellerJordan/modded-nanogpt
GPT-2 (124M) quality in 5B tokens. Do we need lr warmup under muP?
Python Updated Jun 19, 2024
bs-mask Public
An attention implementation that uses the causal mask, shifts the queries 'to the right', adjusts the RoPE encodings accordingly and removes the padding tokens from the output. Empirically collapse…
Python Updated May 27, 2024