Hi! TEN framework is an excellent platform for building conversational voice AI agents.
I noticed there's been community interest in FunASR integration (#1509). I'd like to formally propose adding a `funasr_stt_python` extension alongside the existing `whisper_stt_python`.
Why FunASR for TEN:
- 170x real-time speed on GPU: critical for low-latency voice agents
- Native streaming ASR: Paraformer-streaming is designed for real-time use with sub-second latency
- Built-in VAD + punctuation: Simplifies the audio pipeline
- 50+ languages: SenseVoice model with automatic language detection
- Speaker diarization + emotion detection: Rich metadata for agent responses
- OpenAI-compatible API server: `funasr-server --device cuda`
FunASR is already widely used in voice agent frameworks (Fay 12.8K stars, Pipecat 12.5K stars, LiveKit 10.7K stars).
Quick integration:
from funasr import AutoModel
model = AutoModel(model="iic/SenseVoiceSmall")
result = model.generate(input=audio_bytes)
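For the streaming case, here is a rough sketch of the buffering the extension would probably need between TEN's audio frames and a streaming recognizer. It is not a finished design: the 16 kHz mono 16-bit PCM input, the 600 ms chunk size, and the names `chunk_pcm`, `pcm16_to_float`, `transcribe_stream`, and the `recognize(samples, is_final)` callback are all assumptions made for illustration. In a real extension, `recognize` would wrap whichever FunASR streaming call the chosen model exposes.

```python
import sys
from array import array
from typing import Callable, Iterable, Iterator, List

SAMPLE_RATE = 16_000   # assumed input sample rate in Hz
CHUNK_MS = 600         # assumed chunk duration; tune for the chosen model
BYTES_PER_SAMPLE = 2   # assumed 16-bit mono PCM


def chunk_pcm(frames: Iterable[bytes], chunk_ms: int = CHUNK_MS,
              sample_rate: int = SAMPLE_RATE) -> Iterator[bytes]:
    """Regroup arbitrarily sized PCM frames into fixed-duration chunks."""
    chunk_bytes = sample_rate * chunk_ms // 1000 * BYTES_PER_SAMPLE
    buf = bytearray()
    for frame in frames:
        buf.extend(frame)
        while len(buf) >= chunk_bytes:
            yield bytes(buf[:chunk_bytes])
            del buf[:chunk_bytes]
    if buf:
        # Trailing audio shorter than one full chunk.
        yield bytes(buf)


def pcm16_to_float(chunk: bytes) -> List[float]:
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    usable = len(chunk) - (len(chunk) % BYTES_PER_SAMPLE)  # drop a stray odd byte
    samples = array("h")
    samples.frombytes(chunk[:usable])
    if sys.byteorder == "big":
        samples.byteswap()
    return [s / 32768.0 for s in samples]


def transcribe_stream(frames: Iterable[bytes],
                      recognize: Callable[[List[float], bool], str]) -> List[str]:
    """Feed chunks to a hypothetical recognizer, flagging the last one as final."""
    texts: List[str] = []
    previous = None
    for chunk in chunk_pcm(frames):
        # Hold one chunk back so the last chunk can be marked final.
        if previous is not None:
            text = recognize(pcm16_to_float(previous), False)
            if text:
                texts.append(text)
        previous = chunk
    if previous is not None:
        text = recognize(pcm16_to_float(previous), True)
        if text:
            texts.append(text)
    return texts
```

Passing the recognizer in as a callback keeps the buffering logic testable without loading a model, and it would let the same code path serve either Paraformer-streaming or SenseVoice.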
Happy to contribute a PR with the extension implementation!