Starred repositories
This repository contains the Hugging Face Agents Course.
Everything you need to know to build your own RAG application
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous …
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
FlashMLA: Efficient MLA decoding kernels
A lightweight Python library for simulating Chinese handwriting
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
📚LeetCUDA: 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA.
Xiaomi Home Integration for Home Assistant
SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
🧠 The most comprehensive collection of excellent Qwen prompts in the world. Feel free to contribute your own prompts!
A playbook for effectively prompting post-trained LLMs
Solving algorithm problems is all about patterns, and labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
A self-learning tutorial for CUDA high-performance programming.
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
Code sample showing how to run and benchmark models on Qualcomm's Windows PCs
AI fundamentals: GPU architecture, CUDA programming, and large-model basics
Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊
Chinese translation of "ThinkDSP", http://thinkdsp-cn.readthedocs.io/zh_CN/latest/
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Universal Tensor Operations in Einstein-Inspired Notation for Python.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
How to optimize some algorithms in CUDA.