Stars
A single Gradio + React WebUI with extensions for ACE-Step, Kimi Audio, Piper TTS, GPT-SoVITS, CosyVoice, XTTSv2, DIA, Kokoro, OpenVoice, ParlerTTS, Stable Audio, MMS, StyleTTS2, MAGNet, AudioGen, …
Streaming TTS based on Piper with optional RK3588 NPU support
Interactively visualize Japan's national budget, edit it freely through trial and error, and share your own budget proposal.
A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting multiple languages.
Summary of bugs in the Xuantie C9XX core designs, including C906/C908/C910/C920.
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
Build your own touch screen using light triangulation and CIS sensor!
Stable Diffusion in pure C/C++
Qwen2.5-Omni-3B on Axera
Xiaozhi AI intelligent voice dialogue for embedded Linux.
A library to make M5Stack speak with KeroKero voice
Lightweight LiDAR-Inertial SLAM system for ROS 2. A minimal, dependency-free implementation for research and education.
RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF
🌐 WebAgent for Information Seeking built by Tongyi Lab: WebWalker & WebDancer & WebSailor https://arxiv.org/pdf/2507.02592
Export the STFT or ISTFT process in ONNX format.
Utilizes ONNX Runtime to transcribe audio into text.
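Hands-on Ollama: deploy large models on a CPU. Read online: https://datawhalechina.github.io/handy-ollama/
"A White-Box Guide to Building Large Models": Tiny-Universe, built entirely by hand
Linux kernel learning materials: 200+ classic kernel articles, 100+ kernel papers, 50+ kernel projects, 500+ kernel interview questions, 80+ kernel videos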
The English version of 14 lectures on visual SLAM.