sovse (Andrey) / Starred · GitHub
21 stars written in Python

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 15,139 3,000 Updated Jul 20, 2025

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 12,076 1,930 Updated Jun 26, 2025

Data manipulation and transformation for audio signal processing, powered by PyTorch

Python 2,693 701 Updated Jul 20, 2025

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 1,052 187 Updated Dec 22, 2023

Open STT

Python 799 84 Updated Mar 11, 2022

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 761 105 Updated Feb 15, 2025

Audio Large Language Models

Python 612 34 Updated Jul 5, 2025

ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT

Python 212 50 Updated Feb 9, 2024

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 210 17 Updated Apr 20, 2024

Text To Speech Synthesis with Vosk

Python 197 26 Updated Jul 12, 2025

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 180 12 Updated Jul 12, 2024

Extract hard-coded video subtitles using OCR

Python 77 12 Updated Feb 8, 2025
Python 61 24 Updated Jul 20, 2025

Simple Python package for breaking Russian words into syllables

Python 29 10 Updated Feb 20, 2020

The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".

Python 29 3 Updated May 2, 2025

Speech recognition dataset based on Russian audiobook, sentence-level split

Python 18 1 Updated Oct 6, 2018

Simple example of how to use TensorFlow's CTC loss with VoxForge speech data

Python 18 3 Updated Nov 12, 2016

Recognition of 38 speech commands in Russian. Based on Yandex Cup 2021 ML Challenge: ASR

Python 13 1 Updated Dec 22, 2021

This repository contains the scripts for the models of deep unsupervised learning of vocal entrainment

Python 6 Updated Mar 31, 2022

A simple, fully convolutional model for real-time instance segmentation.

Python 3 2 Updated Sep 4, 2020