kundan2510 (Kundan Kumar) / Starred · GitHub

Organizations

@lyrebird-ai

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

678 42 Updated Aug 3, 2024

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 8,405 709 Updated Jun 9, 2025

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

Python 609 32 Updated Nov 19, 2024

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Python 1,574 125 Updated Jan 1, 2025

Open Source framework for voice and multimodal conversational AI

Python 6,397 929 Updated Jun 10, 2025

LLM training in simple, raw C/CUDA

Cuda 26,824 3,083 Updated May 10, 2025

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Python 2,554 276 Updated Jan 12, 2025

min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch

Python 3,489 252 Updated Apr 28, 2025

Project to play board games like Great Western Trail and Dominant Species online. Backend code for Quarkus, AWS Lambda, DynamoDB. Front end code: https://github.com/tomwetjens/boardgamefiesta-app

Java 4 1 Updated Sep 22, 2022

A flexible source separation library in Python

Python 631 96 Updated Dec 9, 2024

Scripts powering https://infiloop.io/personalstockticker

JavaScript 4 1 Updated Jan 23, 2021

An STFT/iSTFT for PyTorch.

Python 359 52 Updated Oct 31, 2023

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

Python 1,012 218 Updated Aug 28, 2023

Implementation of "Generating Sequences With Recurrent Neural Networks" https://arxiv.org/abs/1308.0850

Jupyter Notebook 243 35 Updated May 1, 2023

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python 54,445 8,995 Updated May 30, 2025

gentle forced aligner

Python 1,586 302 Updated May 19, 2025

Using Convnet to classify images of cats from those of dogs. :)

Python 1 Updated Feb 17, 2019

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding a…

Python 2,387 445 Updated Mar 14, 2022

Code for replication of the paper "The relativistic discriminator: a key element missing from standard GAN"

Python 729 103 Updated Mar 12, 2020

Recurrent neural network for audio noise reduction

C 4,724 952 Updated Feb 22, 2025

Send voicified messages on Slack using your vocal avatar!

JavaScript 33 11 Updated Oct 10, 2018

Minimalist Attention-based RNN for NMT (tested on Multi30k)

Python 5 3 Updated May 17, 2018

A domain specific language to express machine learning workloads.

C++ 1,759 212 Updated Apr 28, 2023

PyTorch based Deep Learning Toolbox

Python 204 14 Updated Jul 27, 2018

Basic DQN implementation

Python 225 71 Updated Dec 28, 2017

Decoupled Neural Interfaces using Synthetic Gradients for PyTorch

Python 237 36 Updated Jan 12, 2019

MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications.

Python 80 31 Updated Oct 14, 2019

A repository of state of the art Deep Learning modules implemented in Tensorflow

Python 5 Updated Aug 18, 2017