sovse (Andrey) / Starred · GitHub

Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation

Jupyter Notebook 170 11 Updated May 7, 2025

High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec

Jupyter Notebook 101 9 Updated Jan 20, 2025

The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".

Python 27 3 Updated May 2, 2025

Audio Large Language Models

Python 569 33 Updated Jun 2, 2025

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 179 12 Updated Jul 12, 2024

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 202 17 Updated Apr 20, 2024

Text To Speech Synthesis with Vosk

Python 195 28 Updated May 16, 2025

Extract hard-coded video subtitles using OCR

Python 76 12 Updated Feb 8, 2025

Recognition of 38 speech commands in Russian. Based on Yandex Cup 2021 ML Challenge: ASR

Python 13 1 Updated Dec 22, 2021

Russian speech technology links

315 21 Updated May 17, 2025

Deep Learning Autonomous Car based on Raspberry Pi, SunFounder PiCar-V Kit, TensorFlow, and Google's EdgeTPU Co-Processor

Jupyter Notebook 409 273 Updated May 18, 2025
Python 61 24 Updated Jun 22, 2025
Jupyter Notebook 40 22 Updated Feb 14, 2025

A simple, fully convolutional model for real-time instance segmentation.

Python 3 2 Updated Sep 4, 2020

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 743 103 Updated Feb 15, 2025

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 12,005 1,920 Updated Jun 10, 2025

A custom MicroPython firmware integrating TensorFlow Lite for Microcontrollers and ulab to implement the TensorFlow Micro examples.

C 184 91 Updated Feb 18, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 14,891 2,951 Updated Jun 23, 2025

Data manipulation and transformation for audio signal processing, powered by PyTorch

Python 2,679 694 Updated Jun 22, 2025

ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch and TensorRT

Python 208 50 Updated Feb 9, 2024

Simple Python package for breaking Russian words into syllables

Python 29 10 Updated Feb 20, 2020

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 1,040 185 Updated Dec 22, 2023

This repository contains the scripts for the models of deep unsupervised learning of vocal entrainment

Python 6 Updated Mar 31, 2022
Jupyter Notebook 29 11 Updated May 5, 2024

License plate recognition. Model training and conversion to TFLite

Jupyter Notebook 40 9 Updated Mar 11, 2021

Open STT

Python 798 84 Updated Mar 11, 2022

Russian speech recognition using TensorFlow, trained on the VoxForge dataset

Jupyter Notebook 58 10 Updated Sep 16, 2022

End-to-end speech to text recognition

Jupyter Notebook 3 2 Updated Oct 27, 2017

Speech recognition dataset based on a Russian audiobook, with sentence-level split

Python 18 1 Updated Oct 6, 2018