Stars
Official repository of SepReformer for speech separation
Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.
An introductory LLM tutorial for developers: the Chinese edition of Andrew Ng's large language model course series
Awesome-LLM: a curated list of Large Language Models
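A step-by-step guide to getting started with AI/LLMs, with tutorials and demo code that take you from API usage to local deployment and fine-tuning of large models. The code files are provided in online Kaggle or Colab versions, so you can follow along even without a GPU. The project also hosts a small code playground 🎡 where you can experiment with some interesting AI scripts. It also includes a complete Chinese mirror of the assignments for Hung-yi Lee's 2024 Introduction to Generative AI course.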
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
Open-Unmix - Music Source Separation for PyTorch
Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs
Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Scaling Diffusion Transformers with Mixture of Experts
PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
A Collection of Variational Autoencoders (VAE) in PyTorch.
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
An implementation of EQFace: A Simple Explicit Quality Network for Face Recognition (https://arxiv.org/abs/2105.00634, CVPRW 2021)
A PyTorch implementation of the vector quantized variational autoencoder (https://arxiv.org/abs/1711.00937)
Unofficial implementation of High Fidelity Neural Audio Compression
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
(ICCV 2019) Uncertainty-aware Face Representation and Recognition
OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat applet). If you find it useful, please star this project.