Transcribe audio into text with ONNX Runtime.
Single Model:
- SenseVoiceSmall
- Whisper-Large-V3 / Whisper-Large-V3-Turbo / Whisper-V3-Japanese / Whisper-V3-Turbo-Japanese / CrisperWhisper / Distil-Whisper-large-v3.5 / anime-whisper / whisper-ja-anime...
- Whisper-Large-V2 / Whisper-V2-Japanese ...
- Paraformer-Small-Chinese / Paraformer-Large-Chinese / Paraformer-Large-English
- Paraformer-Online-Streaming-Chinese
- FireRedASR-AED
- Dolphin
Combined Models (ASR + Speaker Identification):
- End-to-end speech recognition with built-in STFT processing.
- Input: Audio file
- Output: Transcription result
- Can be integrated with additional tools for improved performance.
- This Whisper implementation does not support automatic language detection. Please specify a target language.
- Visit the project overview for further details.
OS | Device | Backend | Model | Real-Time Factor (Chunk Size: 128000 or 8s) |
---|---|---|---|---|
Ubuntu 24.04 | Laptop | CPU i5-7300HQ | SenseVoiceSmall f32 | 0.037 |
Ubuntu 24.04 | Laptop | CPU i5-7300HQ | SenseVoiceSmall q8f32 | 0.075 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall f32 | 0.019 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall q8f32 | 0.022 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall + ERes2NetV2_w24s4ep4 f32 | 0.10 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | Whisper-Large-v3-en q8f32 | 0.15 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | Whisper-Large-v3-Turbo-en q8f32 | 0.073 |
Ubuntu 24.04 | Laptop | CPU i5-7300HQ | Paraformer-Small-Chinese f32 | 0.04 |
Ubuntu 24.04 | Laptop | CPU i5-7300HQ | Paraformer-Large-English q8f32 | 0.14 |
Ubuntu 24.04 | Desktop | CPU i3-12300 | Paraformer-Large-Streaming-Chinese f32 | 0.06 (Chunk Size: 8800) |
Ubuntu 24.04 | Laptop | CPU i3-12300 | FireRedASR-AED-L-Chinese q8f32 | 0.17 |
Ubuntu 24.04 | Laptop | CPU i7-1165G7 | Dolphin-Small q8f32 | 0.14 |
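
The Real-Time Factor column can be read as processing time divided by audio duration, so a value below 1.0 means faster than real time. The sketch below shows one way such a number could be measured. It assumes the standard RTF definition and a 16 kHz sample rate (inferred from "128000 or 8s"); `split_into_chunks`, `real_time_factor`, and the `transcribe_chunk` callback are hypothetical names for illustration, not part of this project.

```python
import time

CHUNK_SIZE = 128000   # samples per chunk, as in the table header
SAMPLE_RATE = 16000   # assumption: 128000 samples / 8 s = 16 kHz


def split_into_chunks(samples, chunk_size=CHUNK_SIZE):
    """Split a sequence of samples into fixed-size chunks, zero-padding the last one."""
    chunks = []
    for start in range(0, len(samples), chunk_size):
        chunk = list(samples[start:start + chunk_size])
        chunk.extend([0] * (chunk_size - len(chunk)))  # pad the final chunk
        chunks.append(chunk)
    return chunks


def real_time_factor(transcribe_chunk, samples, sample_rate=SAMPLE_RATE):
    """Return processing time divided by audio duration.

    transcribe_chunk is a hypothetical callable that stands in for
    a model inference call on one chunk.
    """
    if len(samples) == 0:
        raise ValueError("audio must contain at least one sample")
    audio_seconds = len(samples) / sample_rate
    start = time.perf_counter()
    for chunk in split_into_chunks(samples):
        transcribe_chunk(chunk)
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds
```

Under this reading, an RTF of 0.037 would mean an 8-second chunk is processed in roughly 0.3 seconds on that machine.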