entn-at (Ewald Enzinger) / Starred · GitHub

Starred repositories

This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025

Python 3,747 186 Updated May 5, 2025
Python 6 1 Updated Oct 2, 2024

Connect your devices into a secure WireGuard®-based overlay network with SSO, MFA and granular access controls.

Go 13,686 649 Updated May 22, 2025

An efficient implementation of a rate limiter for asyncio.

Python 628 30 Updated Mar 31, 2025

A neural vocoder supporting Ring Attention, Conformer and NSF.

Python 18 2 Updated Feb 14, 2025

SAM: Sharpness-Aware Minimization (PyTorch)

Python 1,874 205 Updated Feb 21, 2024
HTML 8 1 Updated Mar 31, 2025

FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS

20 Updated Sep 9, 2024

A collection of all our phonemizers for dataset construction and inference

Python 22 2 Updated Feb 21, 2025

Code and model for ICASSP 2025 Paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"

HTML 88 5 Updated Feb 20, 2025

Textbook on reinforcement learning from human feedback

TeX 907 79 Updated May 15, 2025

A system for agentic LLM-powered data processing and ETL

Python 1,962 187 Updated May 21, 2025

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 267 33 Updated Mar 12, 2025
Python 12 1 Updated Jan 14, 2025

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,301 169 Updated Mar 28, 2025

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS…

C++ 6,058 696 Updated May 22, 2025

A PyTorch native platform for training generative AI models

Python 3,824 375 Updated May 21, 2025

YOLO Face 🚀 in PyTorch

Python 456 41 Updated Mar 14, 2025

Learning about CUDA by writing PTX code.

Python 129 4 Updated Feb 27, 2024

Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation

Python 6 Updated Sep 25, 2024

Profile your CoreML models directly from Python 🐍

Python 27 3 Updated Oct 15, 2024

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,751 194 Updated Jan 16, 2025
Python 270 38 Updated May 14, 2025

Annotations and scripts for use with University of Wisconsin X-Ray Microbeam Speech Production Database (1994)

Jupyter Notebook 13 1 Updated Oct 8, 2020

Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".

Python 903 59 Updated Oct 28, 2024

Pitch Estimating Neural Networks (PENN)

Python 253 24 Updated Apr 2, 2025