
Feature Request: Add FunASR STT extension #2174

@LauraGPT

Hi! TEN framework is an excellent platform for building conversational voice AI agents.

I noticed there's been community interest in FunASR integration (#1509). I'd like to formally suggest adding a funasr_stt_python extension alongside the existing whisper_stt_python:

Why FunASR for TEN:

  • Speed: about 170x faster than real time on GPU, which is critical for low-latency voice agents
  • Native streaming ASR: Paraformer-streaming is designed for real-time use, with sub-second latency
  • Built-in VAD + punctuation: Simplifies the audio pipeline
  • 50+ languages: SenseVoice model with automatic language detection
  • Speaker diarization + emotion detection: Rich metadata for agent responses
  • OpenAI-compatible API: funasr-server --device cuda
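To make the streaming point concrete, here is a minimal sketch of how an extension might cut incoming audio into fixed-size chunks and feed them to a streaming recognizer. It assumes 16 kHz, 16-bit mono PCM and 600 ms chunks. `iter_chunks`, `stream_transcribe`, and the `recognize` callable are hypothetical names for illustration, not part of FunASR or TEN; in a real extension, `recognize` would wrap the streaming model call.

```python
from typing import Callable, Dict, Iterator, List, Tuple

SAMPLE_RATE = 16_000   # assumption: 16 kHz input
BYTES_PER_SAMPLE = 2   # assumption: 16-bit mono PCM
CHUNK_MS = 600         # assumption: 600 ms per streaming chunk

# Hypothetical recognizer signature: (chunk, cache, is_final) -> partial text
Recognizer = Callable[[bytes, Dict, bool], str]


def iter_chunks(pcm: bytes, chunk_ms: int = CHUNK_MS) -> Iterator[Tuple[bytes, bool]]:
    """Yield (chunk, is_final) pairs of fixed duration from raw PCM bytes."""
    step = SAMPLE_RATE * chunk_ms // 1000 * BYTES_PER_SAMPLE
    total = max(1, -(-len(pcm) // step))  # ceiling division, at least one chunk
    for i in range(total):
        yield pcm[i * step:(i + 1) * step], i == total - 1


def stream_transcribe(pcm: bytes, recognize: Recognizer) -> str:
    """Feed chunks to the recognizer, sharing one cache dict across calls."""
    cache: Dict = {}
    parts: List[str] = []
    for chunk, is_final in iter_chunks(pcm):
        parts.append(recognize(chunk, cache, is_final))
    return "".join(parts)
```

The shared `cache` dict and the `is_final` flag let the recognizer carry state between chunks and flush on the last one; whether a given model needs both is something the actual extension would have to confirm.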

FunASR is already widely used in voice agent frameworks (Fay 12.8K stars, Pipecat 12.5K stars, LiveKit 10.7K stars).

Quick integration:

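from funasr import AutoModel

# Load SenseVoice-Small (weights are downloaded on first use)
model = AutoModel(model="iic/SenseVoiceSmall")
# audio_bytes: audio data supplied by the caller
result = model.generate(input=audio_bytes)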
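
For the metadata use case, a rough sketch of separating SenseVoice's inline tags from the transcript text. It assumes the raw text carries tags of the form `<|en|><|NEUTRAL|><|Speech|>` ahead of the transcript; that format, and the helper name `split_tags`, are assumptions for illustration. FunASR also ships its own post-processing helper (`rich_transcription_postprocess`), which may be the better choice in practice.

```python
import re
from typing import Dict, List, Union

# Assumed tag shape: <|something|>
TAG = re.compile(r"<\|([^|]*)\|>")


def split_tags(raw: str) -> Dict[str, Union[str, List[str]]]:
    """Return the tag names and the transcript with the tags removed."""
    tags = TAG.findall(raw)              # e.g. language, emotion, event markers
    text = TAG.sub("", raw).strip()      # plain transcript for the agent
    return {"tags": tags, "text": text}
```

An extension could forward `text` to the LLM and attach `tags` as metadata on the result, leaving the interpretation of each tag to downstream code.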

Happy to contribute a PR with the extension implementation!
