Starred repositories
A collection of vision-language-action model post-training methods.
✨✨ Latest advancements in VLA models (Vision Language Action)
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
[Embodied-AI-Survey-2025] Paper List and Resource Repository for Embodied AI
An AI agent powered by LLMs that streamlines the entire process of data analysis. 🚀
🤖 Assemble, configure, and deploy autonomous AI Agents in your browser.
This package contains the original 2012 AlexNet code.
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
No fortress, purely open ground. OpenManus is Coming.
A live-streamed development of RL tuning for LLM agents
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
A generative speech model for daily dialogue.
Best-practice TTS based on BERT and VITS, with some Natural Speech features from Microsoft; supports ONNX streaming output!
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processin…
Utilize the unlimited free GPT-3.5-Turbo API service provided by the login-free ChatGPT Web.
Awesome speech/audio LLMs, representation learning, and codec models
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
✅ Solutions to LeetCode in Go, 100% test coverage, runtime beats 100% / LeetCode solutions
huangxu1991 / GPT-SoVITS-VC
Forked from RVC-Boss/GPT-SoVITS
VC Without Retrain!
Easily train a good VC model with voice data <= 10 mins!
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch