- FAST National University
- Islamabad - Pakistan
- https://linkedin.com/in/sadam1195/
Highlights
Stars
We provide a PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers", in which we present a new method for separating a mixed audio sequence, in which multi…
serp-ai / bark-with-voice-clone
Forked from suno-ai/bark
Text-prompted Generative Audio Model - With the ability to clone voices
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
AI PDF chatbot agent built with LangChain & LangGraph
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) by way of Textual Inversion (https://arxiv.org/abs/2208.01618) for Stable Diffusion (https://arxiv.org/abs/2112.10752). Tweaks focuse…
Data & AI Notebook templates catalog organized by tools, following the IMO (input, model, output) framework for easy usage and discovery.
Solving the Traveling Salesman Problem using Self-Organizing Maps
Desktop application for neural speech synthesis written in C++
Notebooks using the Hugging Face libraries 🤗
Open Source Noise Cancellation App for Virtual Meetings
RetinaFace gets 80.99% on the WIDER FACE hard validation set using MobileNet0.25.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Live speech recognition using Facebook's wav2vec 2.0 model.
Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downstream tasks like translation and summarisation.
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
Fully functional Pokerbot that works on PartyPoker, PokerStars and GGPoker, scraping tables with OpenCV (adaptable via GUI) or a neural network and making decisions based on a genetic algorithm and …
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
A simple approach to use GPT2-medium (345M) for generating high quality text summaries with minimal training.
spring-media / ForwardTacotron
Forked from fatchord/WaveRNN
⏩ Generating speech in a single forward pass without any attention!
Grapheme to phoneme conversion with deep learning.
View and control terminals from your browser with end-to-end encryption
Code & Data for Enhancing Photorealism Enhancement
Aim 💫 - An easy-to-use & supercharged open-source experiment tracker.
A MUSHRA-compliant, Web Audio API based experiment software
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
Magnificent app which corrects your previous console command.