smksyj

18 followers · 23 following

Lists (32)

Starred repositories

yifan123 / flow_grpo

An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 546 19 Updated May 18, 2025

hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Python 26,444 2,557 Updated Apr 30, 2025

XueZeyue / DanceGRPO

173 3 Updated May 12, 2025

mct10 / RepCodec

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 177 11 Updated Jul 12, 2024

bytedance / deer-flow

DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.

TypeScript 10,650 991 Updated May 18, 2025

jzq2000 / MoonCast

Python 126 12 Updated Apr 11, 2025

MoonshotAI / Kimi-Audio

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 3,607 226 Updated May 8, 2025

anan235 / dia-multilingual

Forked from nari-labs/dia

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 154 9 Updated Apr 23, 2025

stlohrey / dia-finetuning

Forked from nari-labs/dia

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 73 10 Updated May 8, 2025

astral-sh / uv

An extremely fast Python package and project manager, written in Rust.

Rust 54,688 1,532 Updated May 18, 2025

devnen / Dia-TTS-Server

Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), support for SafeTensors/BF16, voice cloning, dialogue generation, …

Python 181 35 Updated May 4, 2025

Lex-au / Orpheus-FastAPI

High-performance Text-to-Speech server with OpenAI-compatible API, 8 voices, emotion tags, and modern web UI. Optimized for RTX GPUs.

Python 366 69 Updated Apr 18, 2025

tuanh123789 / Spark-TTS-finetune

finetune llm part for spark-tts model

Python 70 7 Updated Mar 25, 2025

unslothai / unsloth

Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥

Python 38,873 3,045 Updated May 18, 2025

bishabosha / scalar-2025

Ideas and demonstrations of named tuples to the max

Scala 23 Updated Apr 10, 2025

squidfunk / mkdocs-material

Documentation that simply works

Python 23,325 3,752 Updated May 15, 2025

jatcwang / difflicious

Scala library for readable diffs of values

Scala 98 10 Updated May 9, 2025

cubed-dev / cubed

Scalable array processing with bounded memory

Python 194 17 Updated Apr 1, 2025

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,898 443 Updated Aug 7, 2024

FireRedTeam / FireRedASR

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…

Python 974 75 Updated Mar 27, 2025

stepfun-ai / Step1X-Edit

A SOTA open-source image editing model, which aims to provide comparable performance against the closed-source models like GPT-4o and Gemini 2 Flash.

Python 1,245 56 Updated May 13, 2025

ShadeAlsha / ICon

ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning"

Python 85 4 Updated May 16, 2025

shangshang-wang / Tina

Tina: Tiny Reasoning Models via LoRA

Python 217 21 Updated May 14, 2025

x1xhlol / system-prompts-and-models-of-ai-tools

FULL v0, Cursor, Manus, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent, VSCode Agent, Dia Browser & Trae AI (And other Open Sourced) System Prompts, Tools & AI Models.

48,587 14,922 Updated May 17, 2025

nari-labs / dia

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 15,837 1,246 Updated May 15, 2025

sake92 / hepek

Typesafe HTML templates and static site generator in pure Scala

Scala 110 10 Updated Apr 15, 2025

sake92 / openapi4s

openapi4s

Scala 23 Updated Apr 4, 2025

SparkAudio / Spark-TTS

Spark-TTS Inference Code

Python 9,420 984 Updated Apr 9, 2025

Lakonik / GMFlow

[ICML 2025] Gaussian Mixture Flow Matching Models (GMFlow)

Python 92 3 Updated May 14, 2025

dangvansam / viet-tts

VietTTS: An Open-Source Vietnamese Text to Speech

Python 52 14 Updated Dec 12, 2024

Lists (32)

alg

architecture

audio

awesome

backend

conditioning

diffusion

disentangle

fast_inference

flow

frontend

gan

infra

language

llm

lora

manifold

ml_materials

mlops

MoE

music

nas

neural_ode

optimization

personalization

quantization

Scala

style_transfer

svc

video

vision

web
