- Panasonic Research and Development Center Singapore
- Singapore
Stars
MobileFaceNets: Efficient CNNs for Accurate Real-Time Face Verification on Mobile Devices
🐳 A curated list of Docker resources and projects
A feature-rich command-line audio/video downloader
target speaker extraction and verification for multi-talker speech
Speech emotion recognition using convolutional recurrent networks, based on IEMOCAP
Python library for Room Impulse Response (RIR) simulation with GPU acceleration
SLAM - Simultaneous localization and mapping using OpenCV and NumPy.
Speech Toolkit for Malaysian language, https://malaya-speech.readthedocs.io/
This repository contains a multi-fisheye camera SLAM. The underlying SLAM system is based on ORB-SLAM.
MelGAN vocoder (compatible with NVIDIA/tacotron2)
DSO with SIM(3) pose graph optimization and loop closure
FBOW (Fast Bag of Words) is an extremely optimized version of the DBow2/DBow3 libraries.
Large, modern dataset for speech recognition
Speaker embedding (d-vector) trained with GE2E loss
A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.
speaker diarization by uis-rnn and speaker embedding by vgg-speaker-recognition
A dataset for estimation of hand pose and shape from single color images.
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
Tools for handling multimodal data in machine learning projects.
A librosa STFT/Fbank/MFCC feature extraction written in PyTorch using 1D convolutions.
Official repository for RawNet, RawNet2, and RawNet3