Stars
Flash-Muon: An Efficient Implementation of Muon Optimizer
NVIDIA Linux open GPU with P2P support
A Survey on Data Selection for Language Models
A reading list on LLM based Synthetic Data Generation 🔥
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
A framework for few-shot evaluation of language models.
Run a GPT model in the browser with WebGPU. An implementation of GPT inference in fewer than ~1500 lines of vanilla JavaScript.
Retro device nc2000/nc2600 emulator (6502 CPU) for the 文曲星 (Wenquxing) handheld.
🕸 GlotCC Dataset and Pipeline -- NeurIPS 2024
MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning
A list of microgrant programs for your good ideas
The dataset contains cabinet statements from the South African government. Data was scraped from the government's website: https://www.gov.za/cabinet-statements
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Devon: An open-source pair programmer
A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment…
Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines
A playbook for systematically maximizing the performance of deep learning models.
PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
hamishivi / EasyLM
Forked from young-geng/EasyLM
Large language models (LLMs) made easy. EasyLM is a one-stop solution for pre-training, finetuning, evaluating, and serving LLMs in JAX/Flax.
A PyTorch native platform for training generative AI models
Instant voice cloning by MIT and MyShell. Audio foundation model.
The original sources of MS-DOS 1.25, 2.0, and 4.0 for reference purposes
OCR, layout analysis, reading order, table recognition in 90+ languages