WhiteFu

speech synthesis & voice conversion & speech enhancement

46 followers · 442 following

codec-bpe Public
Forked from AbrahamSanders/codec-bpe

Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs

Python MIT License Updated Sep 26, 2024
overseas-website-note Public
Forked from princehuang/overseas-website-note

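Building "overseas tool websites" has become the main career of my life. I'm glad I got to it in time, and I'm grateful for this great AI era.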

Updated Sep 5, 2024
llm-datasets Public
Forked from mlabonne/llm-datasets

High-quality datasets, tools, and concepts for LLM fine-tuning.

Updated Aug 11, 2024
OpenRLHF Public
Forked from OpenRLHF/OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Python Apache License 2.0 Updated Jun 24, 2024
Awesome-LLMs-meet-Multimodal-Generation Public
Forked from YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

HTML Updated Jun 15, 2024
Mantis Public
Forked from TIGER-AI-Lab/Mantis

Official code for Paper "Mantis: Multi-Image Instruction Tuning"

Python Apache License 2.0 Updated Jun 4, 2024
diarizers Public
Forked from huggingface/diarizers

Python Updated May 22, 2024
MoneyPrinterTurbo Public
Forked from harry0703/MoneyPrinterTurbo

Generate high-definition short videos with one click using large AI models.

Python MIT License Updated May 5, 2024
Bunny Public
Forked from BAAI-DCAI/Bunny

A family of lightweight multimodal models.

Python Apache License 2.0 Updated Apr 24, 2024
lina-speech Public
Forked from theodorblackbird/lina-speech

lina-speech : linear attention based text-to-speech

Jupyter Notebook Other Updated Apr 24, 2024
i-Code Public
Forked from microsoft/i-Code

Jupyter Notebook MIT License Updated Apr 18, 2024
pyvideotrans Public
Forked from jianchang512/pyvideotrans

Translate a video from one language to another and add dubbing.

Python GNU General Public License v3.0 Updated Apr 11, 2024
llava-phi Public
Forked from xmoanvaf/llava-phi

Python Updated Apr 9, 2024
audio-pipeline Public
Forked from pengzhendong/audio-pipeline

Python Apache License 2.0 Updated Apr 6, 2024
Awesome-LLMs-Datasets Public
Forked from lmmlzn/Awesome-LLMs-Datasets

Summarize existing representative LLMs text datasets.

Apache License 2.0 Updated Apr 6, 2024
FRESCO Public
Forked from williamyang1991/FRESCO

[CVPR 2024] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation

Jupyter Notebook Other Updated Apr 4, 2024
pytorch-speech-features Public
Forked from apple/pytorch-speech-features

Python Other Updated Apr 2, 2024
pyannote-whisper Public
Forked from yinruiqing/pyannote-whisper

Python Updated Mar 24, 2024
VoiceCraft Public
Forked from jasonppy/VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Python Other Updated Mar 22, 2024
tts-qa Public
Forked from aixplain/tts-qa

Python Updated Mar 15, 2024
awesome-audio-plaza Public
Forked from metame-ai/awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation

MIT License Updated Mar 11, 2024
ConsistI2V Public
Forked from TIGER-AI-Lab/ConsistI2V

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation

Python MIT License Updated Mar 9, 2024
SoraReview Public
Forked from lichao-sun/SoraReview

The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".

Updated Mar 8, 2024
Open-Sora Public
Forked from hpcaitech/Open-Sora

Building your own video generation model like OpenAI's Sora

Python Apache License 2.0 Updated Mar 8, 2024
EVA Public
Forked from baaivision/EVA

EVA Series: Visual Representation Fantasies from BAAI

Python MIT License Updated Mar 8, 2024
M2UGen Public
Forked from shansongliu/MuMu-LLaMA

This is the official repository for M2UGen

Jupyter Notebook MIT License Updated Mar 7, 2024
ai-audio-startups Public
Forked from csteinmetz1/ai-audio-startups

Community list of startups working with AI in audio and music technology

Apache License 2.0 Updated Mar 6, 2024
snac Public
Forked from hubertsiuzdak/snac

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

Python MIT License Updated Mar 5, 2024
AudioEditingCode Public
Forked from HilaManor/AudioEditingCode

Python Creative Commons Attribution Share Alike 4.0 International Updated Mar 4, 2024
metavoice-src Public
Forked from metavoiceio/metavoice-src

Foundational model for human-like, expressive TTS

Python Apache License 2.0 Updated Mar 1, 2024
