automatic-speech-recognition

Here are 434 public repositories matching this topic...

wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

pytorch transformer speech-recognition automatic-speech-recognition production-ready whisper asr conformer e2e-models

Updated May 11, 2026
Python

ahmetoner / whisper-asr-webservice

OpenAI Whisper ASR Webservice API

docker speech speech-recognition automatic-speech-recognition speech-to-text asr openai-whisper

Updated Nov 23, 2025
Python

zzw922cn / awesome-speech-recognition-speech-synthesis-papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

Updated Oct 19, 2023

zzw922cn / Automatic_Speech_Recognition

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

audio deep-learning tensorflow paper end-to-end evaluation cnn lstm speech-recognition rnn automatic-speech-recognition feature-vector data-preprocessing phonemes timit-dataset layer-normalization rnn-encoder-decoder chinese-speech-recognition

Updated Mar 24, 2023
Python

coqui-ai / STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

deep-learning tensorflow voice-recognition speech-recognition automatic-speech-recognition speech-to-text stt asr speech-recognizer speech-recognition-api

Updated Mar 11, 2024
C++

FluidInference / FluidAudio

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.

audio macos swift ios real-time avfoundation nvidia vad automatic-speech-recognition speech-to-text ane speaker-recognition asr speaker-diarization voice-activity-detection coreml speaker-identification speaker-embedding parakeet

Updated Jun 14, 2026
Swift

TEN-framework / ten-vad

Voice Activity Detector (VAD): low-latency, high-performance, and lightweight

audio real-time voice-commands speech voice-recognition vad automatic-speech-recognition speech-processing conversational-ai voice-activity-detection voice-agent silero-vad

Updated Feb 2, 2026
C

FireRedTeam / FireRedASR

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects, and English, achieving a new SOTA on public Mandarin ASR benchmarks while also offering outstanding singing lyrics recognition capability.

open-source transformer speech-recognition automatic-speech-recognition asr conformer llm industrial-grade multimodal-llm speechllm

Updated Feb 25, 2026
Python

kakaobrain / pororo

PORORO: Platform Of neuRal mOdels for natuRal language prOcessing

natural-language-processing deep-learning speech-synthesis automatic-speech-recognition neural-models

Updated Mar 23, 2022
Python

TensorSpeech / TensorFlowASR

⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that can use characters or subwords.

tensorflow speech-recognition jasper automatic-speech-recognition speech-to-text ctc conformer deepspeech2 tflite rnn-transducer end2end tensorflow2 contextnet tflite-model tflite-convertion subword-speech-recognition streaming-transducer

Updated Jun 11, 2025
Python

jitsi / jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

python3 automatic-speech-recognition speech-to-text evaluation-metrics wer word-error-rate

Updated Apr 16, 2026
Python

snakers4 / open_stt

Open STT

dataset russian automatic-speech-recognition speech-to-text stt asr

Updated Mar 11, 2022
Python

AutoArk / GPA

[AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion with one tiny model!

text-to-speech tts transformer automatic-speech-recognition voice-conversion asr vc

Updated May 25, 2026
Python

EmulationAI / awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

music-information-retrieval automatic-speech-recognition speech-to-text audio-processing music-ai music-processing large-language-models foundational-models speech-ai audio-ai large-audio-models speech-llms large-language-model-speech

Updated Jun 3, 2026

shirayu / whispering

Streaming transcriber with whisper

automatic-speech-recognition whisper

Updated May 1, 2023
Python

vilassn / whisper_android

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

android text-to-speech mobile embedded translation offline tensorflow tts speech-recognition openai automatic-speech-recognition transcription texttospeech whisper asr transcribe tensorflowlite tflite

Updated Mar 18, 2026
C++

Picovoice / cheetah

On-device streaming speech-to-text engine powered by deep learning

voice-recognition speech-recognition automatic-speech-recognition speech-to-text transcription stt asr online-speech-recognition streaming-speech-to-text

Updated Jun 10, 2026
Python

hirofumi0810 / neural_sp

End-to-end ASR/LM implementation with PyTorch

streaming speech language-modeling pytorch transformer speech-recognition seq2seq attention automatic-speech-recognition sequence-to-sequence language-model attention-mechanism asr ctc rnn-transducer transformer-xl

Updated Aug 30, 2021
Python

FireRedTeam / FireRedASR2S

A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandarin, 20+ dialects/accents), English, code-switching, and both speech and singing ASR. FireRedVAD supports speech/singing/music in 100+ langs. FireRedLID supports 100+ langs and 20+ zh dialects. FireRedPunc supports zh and en.

open-source speech-recognition vad automatic-speech-recognition asr lid language-identification sota voice-activity-detection asr-pipeline punctuation-restoration audio-event-classification llm punctuation-prediction industrial-grade multimodal-llm speechllm audio-event-detection

Updated Jun 2, 2026
Python

YoavRamon / awesome-kaldi

This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )

speech speech-recognition awesome-list automatic-speech-recognition speech-to-text kaldi kaldi-asr

Updated Feb 9, 2022
