ease-zh (ease_zh) / Starred · GitHub
Python 5,630 422 Updated May 11, 2025
TypeScript 1 Updated Jan 9, 2025

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…

Python 61,856 6,259 Updated Aug 24, 2024

A curated list of peer-reviewed papers on theoretical and practical aspects of drivers' attention used for paper "Attention for Vision-Based Assistive and Automated Driving: A Review of Algorithms …

128 14 Updated May 30, 2025

new large-scale dataset for vision-based drowsiness detection

Python 74 3 Updated Apr 20, 2023

PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722

Python 5,019 798 Updated Jun 26, 2025

An awesome face technology repository.

HTML 1,277 214 Updated Jun 3, 2022

State-of-the-art 2D and 3D Face Analysis Project

Python 25,852 5,672 Updated Jun 16, 2025

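OCR software, free and offline. Open-source, free offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Ships with built-in libraries for multiple languages.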

Python 35,472 3,548 Updated May 31, 2025

Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.

C++ 32,369 7,431 Updated Jul 9, 2025

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

57,358 6,183 Updated Jun 4, 2025

Controllable and fast Text-to-Speech for over 7000 languages!

Python 1,622 185 Updated Jun 30, 2025

Simple text to phones converter for multiple languages

Python 1,406 189 Updated Sep 26, 2024

eSpeak NG is an open-source speech synthesizer that supports more than a hundred languages and accents.

C 5,276 1,055 Updated Jul 12, 2025

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Python 2,501 191 Updated Mar 31, 2025

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 16,737 1,774 Updated Jul 2, 2025

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python 7,554 1,368 Updated Dec 6, 2023

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Jupyter Notebook 995 90 Updated Nov 4, 2024

Efficient neural speech synthesis

C 1,178 302 Updated Sep 21, 2024

The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) l…

HTML 537 155 Updated Jul 1, 2024

Paragraph-by-paragraph close readings of classic and new deep learning papers

30,769 2,674 Updated Mar 22, 2025

A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, …

Python 326 42 Updated Sep 24, 2022

Chinese text normalization for speech processing

Python 690 148 Updated Mar 18, 2023

Production First and Production Ready End-to-End Speech Recognition Toolkit

Python 4,668 1,145 Updated Jul 12, 2025

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Jupyter Notebook 5,248 1,419 Updated Jun 12, 2024

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

Python 1,015 218 Updated Aug 28, 2023

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 2,178 537 Updated Jul 27, 2024

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 146,919 29,632 Updated Jul 14, 2025

End-to-End Speech Processing Toolkit

Python 9,285 2,296 Updated Jul 11, 2025