The Hugging Face `datasets` library supports audio datasets, but curated speech recognition benchmark datasets for non-English languages, especially Chinese, are hard to find on the Hub. FunASR (17.8K+ stars, https://github.com/modelscope/FunASR) provides production-grade ASR models with extensive multilingual support:
- SenseVoice: Ultra-fast multilingual ASR (50+ languages, strong CJK + Cantonese)
- Paraformer: Production-grade Chinese ASR with timestamps and punctuation
Proposal: add FunASR benchmark datasets (e.g., Chinese speech recognition test sets and multilingual ASR evaluation data) to the Hugging Face Hub. This would give the broader ASR research community standardized evaluation benchmarks beyond the current Whisper-centric datasets.
FunASR models are already available on the Hugging Face Hub (https://huggingface.co/FunAudioLLM), so the datasets and the models they evaluate could be paired in one place.
Would adding FunASR evaluation datasets be useful?
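For context, Chinese ASR test sets are usually scored by character error rate (CER) rather than word error rate, since Chinese text has no whitespace word boundaries. Below is a minimal, dependency-free sketch of how transcripts from such a benchmark could be scored. The function names `edit_distance` and `cer` and the sample sentences are illustrative assumptions, not part of FunASR or `datasets`.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters here)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total character edits / total reference characters.

    Spaces are stripped before comparison; punctuation handling is left to
    the caller, since benchmarks differ on text normalization.
    """
    total_edits = 0
    total_chars = 0
    for ref, hyp in zip(references, hypotheses):
        ref_chars = ref.replace(" ", "")
        hyp_chars = hyp.replace(" ", "")
        total_edits += edit_distance(ref_chars, hyp_chars)
        total_chars += len(ref_chars)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars


# Hypothetical reference/hypothesis pair with one substituted character.
refs = ["今天天气很好"]
hyps = ["今天天汽很好"]
print(f"CER: {cer(refs, hyps):.3f}")
```

In practice the hypotheses would come from a FunASR model run over the audio column of the dataset, and a shared text-normalization step would need to be agreed on for results to be comparable.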