Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,195 726 Updated May 27, 2025

SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 12,442 1,787 Updated Jun 24, 2025

linshenkx / prompt-optimizer

A prompt optimizer that helps you write high-quality prompts

TypeScript 7,840 990 Updated Jun 24, 2025

SparkAudio / Spark-TTS

Spark-TTS Inference Code

Python 9,885 1,048 Updated Apr 9, 2025

Xiaojiu-z / EasyControl

Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer"

Python 1,570 124 Updated May 27, 2025

prs-eth / thera

Thera: Aliasing-Free Arbitrary-Scale Super-Resolution with Neural Heat Fields

Python 807 55 Updated Apr 30, 2025

deezer / spleeter

Deezer source separation library including pretrained models.

Python 27,035 2,963 Updated Apr 2, 2025

e2b-dev / awesome-ai-agents

A list of AI autonomous agents

18,955 1,455 Updated Feb 26, 2025

SkyworkAI / SkyReels-V1

SkyReels V1: The first and most advanced open-source human-centric video foundation model

Python 2,218 225 Updated Mar 10, 2025

Open-Magic-Video / Magic-1-For-1

Python 747 54 Updated Feb 18, 2025

fal-ai-community / video-starter-kit

Enable AI models for video production in the browser

TypeScript 1,872 223 Updated Jun 12, 2025

GitHubDaily / GitHubDaily

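Regularly sharing high-quality, interesting, and practical open-source tutorials, developer tools, programming websites, and tech news from GitHub. A list of cool, interesting projects on GitHub.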

38,668 4,032 Updated Mar 20, 2025

zsyOAOA / InvSR

Arbitrary-steps Image Super-resolution via Diffusion Inversion (CVPR 2025)

Python 1,168 73 Updated Apr 1, 2025

unclecode / crawl4ai

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

Python 46,503 4,448 Updated Jun 25, 2025

lehduong / OneDiffusion

Official implementation of OneDiffusion paper (CVPR 2025)

Python 640 19 Updated Dec 14, 2024

yangchris11 / samurai

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python 6,859 449 Updated Mar 18, 2025

serengil / deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python

Python 19,485 2,636 Updated Jun 18, 2025

RedAIGC / StoryMaker

StoryMaker: Towards consistent characters in text-to-image generation

Python 703 58 Updated Dec 2, 2024

muzishen / IMAGDressing

[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high …

Python 1,259 111 Updated Jun 2, 2025

THUDM / CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 11,603 1,123 Updated Jun 17, 2025

rupeshs / fastsdcpu

Fast stable diffusion on CPU

Python 1,721 154 Updated Jun 12, 2025

hzwer / ECCV2022-RIFE

ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Python 4,947 477 Updated May 7, 2025

Kedreamix / Linly-Talker

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…

Python 2,749 445 Updated Feb 24, 2025

carlxwz

Starred repositories

bryanswkim / Chain-of-Zoom

prasunroy / stefann

mswnlz / edu-knowlege

bytedance / DreamO

ace-step / ACE-Step

nari-labs / dia

roboflow / rf-detr

open-mmlab / Amphion