sovse (Andrey) / Starred · GitHub

Andrey sovse

15 followers · 2 following

Russia

Achievements

Stars

21 stars written in Python

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 15,139 3,000 Updated Jul 20, 2025

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 12,076 1,930 Updated Jun 26, 2025

pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

Python 2,693 701 Updated Jul 20, 2025

sooftware / conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 1,052 187 Updated Dec 22, 2023

snakers4 / open_stt

Open STT

Python 799 84 Updated Mar 11, 2022

jitsi / jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 761 105 Updated Feb 15, 2025

AudioLLMs / Awesome-Audio-LLM

Audio Large Language Models

Python 612 34 Updated Jul 5, 2025

dusty-nv / jetson-voice

ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT

Python 212 50 Updated Feb 9, 2024

lifeiteng / naturalspeech3_facodec

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 210 17 Updated Apr 20, 2024

alphacep / vosk-tts

Text To Speech Synthesis with Vosk

Python 197 26 Updated Jul 12, 2025

mct10 / RepCodec

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 180 12 Updated Jul 12, 2024

wenet-e2e / wesubtitle

Extract hard-coded subtitles from video using OCR

Python 77 12 Updated Feb 8, 2025

DWCTOD / cv-arxiv-daily

Python 61 24 Updated Jul 20, 2025

Koziev / rusyllab

Simple Python package for breaking Russian words into syllables

Python 29 10 Updated Feb 20, 2020

LingweiMeng / Whisper-Sidecar

The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".

Python 29 3 Updated May 2, 2025

ainy / shershe

Speech recognition dataset based on a Russian audiobook, sentence-level split

Python 18 1 Updated Oct 6, 2018

mleimeister / ctc_tensorflow_voxforge

Simple example of how to use TensorFlow's CTC loss with VoxForge speech data

Python 18 3 Updated Nov 12, 2016

sovse / Speech_38_ru_commands

Recognition of 38 speech commands in Russian. Based on Yandex Cup 2021 ML Challenge: ASR

Python 13 1 Updated Dec 22, 2021

nasir0md / unsupervised-learning-entrainment

This repository contains the scripts for the models of deep unsupervised learning of vocal entrainment

Python 6 Updated Mar 31, 2022

luiszeni / yolact_onnx

Forked from dbolya/yolact

A simple, fully convolutional model for real-time instance segmentation.

Python 3 2 Updated Sep 4, 2020

MahmoodGhouri001 / deskew-scanned-images

Python 2 1 Updated Apr 21, 2020
