Stars
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. Powered by Vercel AI SDK! Search with models like xAI's Grok 3.
Benchmark of Apple MLX operations on all Apple Silicon chips (GPU, CPU) + MPS and CUDA.
prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
A curated and opinionated list of resources for Chief Technology Officers, with the emphasis on startups
Machine Learning Engineering Open Book
No fortress, purely open ground. OpenManus is Coming.
CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds
The simplest implementation of recent Sparse Attention patterns for efficient LLM inference.
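As a rough illustration of one such pattern (not code from the repository), here is a minimal causal sliding-window attention in PyTorch; the window size and tensor shapes are hypothetical:

```python
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask: query i may attend to keys max(0, i - window + 1) .. i."""
    i = torch.arange(seq_len).unsqueeze(1)  # query positions, (L, 1)
    j = torch.arange(seq_len).unsqueeze(0)  # key positions, (1, L)
    return (j <= i) & (j > i - window)

def sparse_attention(q, k, v, window: int = 4):
    # q, k, v: (batch, seq_len, dim)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    mask = sliding_window_mask(q.shape[1], window)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 8, 16)
out = sparse_attention(q, k, v, window=4)  # (1, 8, 16)
```

Each query only attends to a fixed-size local window, so attention cost grows linearly with sequence length instead of quadratically.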
A production-ready template to kickstart your Generative AI projects with structure and scalability in mind.
Modular and structured prompt caching for low-latency LLM inference
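A minimal sketch of the prefix-caching idea behind such systems: a completed prompt prefix maps to its attention KV state, so a later request sharing that prefix can skip re-prefilling. The class and method names here are hypothetical, not the repository's API:

```python
import hashlib

class PrefixCache:
    """Hypothetical prefix cache: prompt prefix -> opaque KV state."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(prompt_prefix: str) -> str:
        return hashlib.sha256(prompt_prefix.encode()).hexdigest()

    def get(self, prompt_prefix: str):
        return self._store.get(self._key(prompt_prefix))

    def put(self, prompt_prefix: str, kv_state) -> None:
        self._store[self._key(prompt_prefix)] = kv_state

cache = PrefixCache()
cache.put("You are a helpful assistant.", kv_state={"layers": "..."})
hit = cache.get("You are a helpful assistant.")  # reuse instead of re-prefilling
```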
ZihanWang314 / nano-vllm
Forked from GeeeekExplorer/nano-vllm. Nano vLLM.
Full system prompts, tools & AI models for v0, Cursor, Manus, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent, VSCode Agent, Dia Browser, Trae AI & Cluely (and other open-sourced projects).
VietTTS: An Open-Source Vietnamese Text-to-Speech Model
Witness the aha moment of VLM with less than $3.
Load What You Need: Smaller Multilingual Transformers for PyTorch and TensorFlow 2.0.
A PyTorch-based model pruning toolkit for pre-trained language models
Reduce the size of pretrained Hugging Face models via vocabulary trimming.
Vocabulary Trimming (VT) is a model compression technique that reduces a multilingual LM's vocabulary to a target language by deleting irrelevant tokens. This repository contain…
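A minimal sketch of the trimming idea (my illustration, not the repository's API): keep only the embedding rows for token ids that the target language actually uses, then remap ids. `keep_ids` would come from tokenizing a target-language corpus; here it is a stand-in:

```python
import torch

def trim_embedding(embedding: torch.nn.Embedding, keep_ids: list[int]):
    """Keep only the rows in `keep_ids`; return the smaller embedding
    plus an old-id -> new-id mapping for retokenization."""
    keep = torch.tensor(sorted(set(keep_ids)))
    trimmed = torch.nn.Embedding(len(keep), embedding.embedding_dim)
    trimmed.weight.data = embedding.weight.data[keep].clone()
    old_to_new = {old.item(): new for new, old in enumerate(keep)}
    return trimmed, old_to_new

emb = torch.nn.Embedding(250_000, 768)  # multilingual-sized vocabulary
trimmed, remap = trim_embedding(emb, keep_ids=list(range(30_000)))
print(trimmed.weight.shape)             # torch.Size([30000, 768])
```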
Solve Visual Understanding with Reinforced VLMs
Due to the huge vocabulary size (151,936) of Qwen models, the embedding and LM head weights are excessively heavy. This project therefore provides a tokenizer vocabulary shearing solution for Qwen…
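A back-of-envelope calculation shows why those two matrices dominate; the hidden size of 3584 below matches Qwen2-7B's published config and is an assumption for other model sizes:

```python
vocab, hidden = 151_936, 3_584
params_per_matrix = vocab * hidden   # one of: input embedding, LM head
total = 2 * params_per_matrix        # untied embedding + LM head
print(f"{params_per_matrix / 1e6:.0f}M params per matrix")  # ~545M
print(f"{total * 2 / 1e9:.2f} GB in fp16")                  # ~2.18 GB
```

Shearing the vocabulary shrinks both matrices proportionally, which is why it pays off so well for deployment.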
🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement learning, and text-only reinforcement learning—to …
EleutherAI / nanoGPT-mup
Forked from karpathy/nanoGPT. The simplest, fastest repository for training/finetuning medium-sized GPTs.
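A simplified sketch of the core muP idea this fork adds (Yang et al., Tensor Programs V), not the fork's actual code: hidden-layer Adam learning rates shrink as 1/width relative to a small base model, so hyperparameters tuned at small width transfer to large width. The grouping rule below (all matrix-shaped parameters) is a coarse approximation of full muP:

```python
import torch

def mup_param_groups(model, base_width: int, width: int, lr: float):
    """Scale matrix-parameter LR by base_width / width; leave the rest alone."""
    mult = width / base_width
    hidden, other = [], []
    for p in model.parameters():
        (hidden if p.ndim >= 2 else other).append(p)
    return [
        {"params": hidden, "lr": lr / mult},  # 1/width correction
        {"params": other, "lr": lr},
    ]

model = torch.nn.Sequential(torch.nn.Linear(256, 1024), torch.nn.ReLU(),
                            torch.nn.Linear(1024, 256))
opt = torch.optim.Adam(mup_param_groups(model, base_width=256, width=1024, lr=3e-4))
```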
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.