- Portland, Oregon
- https://entn.at/
- @entn_at@sigmoid.social
- @entn_at
- @entn-at.bsky.social
Starred repositories
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
Connect your devices into a secure WireGuard®-based overlay network with SSO, MFA and granular access controls.
An efficient implementation of a rate limiter for asyncio.
A neural vocoder supporting Ring Attention, Conformer and NSF.
FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS
A collection of all our phonemizers for dataset construction and inference
Code and model for ICASSP 2025 Paper "Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data"
Textbook on reinforcement learning from human feedback
A system for agentic LLM-powered data processing and ETL
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS…
A PyTorch native platform for training generative AI models
akanametov / yolo-face
Forked from ultralytics/ultralytics
YOLO Face 🚀 in PyTorch
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation
Profile your CoreML models directly from Python 🐍
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Annotations and scripts for use with University of Wisconsin X-Ray Microbeam Speech Production Database (1994)
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
Pitch Estimating Neural Networks (PENN)