feat(assemblyai): add language_code streaming param for language steering #6219
Open
gsharp-aai wants to merge 2 commits into
0d6f56c to
b9de240
Compare
Contributor
Author
…ring Adds a language_code connect-time parameter to the AssemblyAI STT plugin, steering transcription toward a specific language (e.g. 'en', 'es') instead of automatic detection/code-switching. The plugin previously only exposed language_detection (an output toggle), with no way to steer language, even though the AssemblyAI streaming API accepts language_code at connect time. Language steering is applied by the u3-pro ASR, so language_code is gated to the u3-rt-pro family (u3-rt-pro, u3-rt-pro-beta-1, universal-3-5-pro) via the existing _U3_PRO_MODELS validation, matching how mode and voice_focus are handled. Connect-time only, matching the AssemblyAI streaming API, where language_code is not part of UpdateConfiguration.
…O 639-1 normalization
- Type language_code as LanguageCode (accepting str input) so common
formats ('en-US', 'english') are normalized to a bare ISO 639-1 code
before being sent, matching the language steering expectation.
- Use the "Universal-3 Pro family" umbrella term in the docstring.
- Add a normalization test covering region/name/ISO input forms.
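The normalization described in this commit can be illustrated with a minimal standalone sketch. It does not use the plugin's LanguageCode type; normalize_language_code and the small name table below are hypothetical and only show the idea of reducing region-tagged or spelled-out input to a bare two-letter code.

```python
# Hypothetical helper for illustration only; the PR relies on the
# LanguageCode type rather than a function like this one.
_NAME_TO_CODE = {"english": "en", "spanish": "es", "french": "fr"}


def normalize_language_code(value: str) -> str:
    """Reduce inputs such as 'en-US' or 'english' to a bare two-letter code."""
    cleaned = value.strip().lower().replace("_", "-")
    # Spelled-out language names are looked up in the (assumed) table.
    if cleaned in _NAME_TO_CODE:
        return _NAME_TO_CODE[cleaned]
    # Otherwise keep only the primary subtag and drop any region suffix.
    primary = cleaned.split("-", 1)[0]
    if len(primary) == 2 and primary.isalpha():
        return primary
    raise ValueError(f"cannot normalize language code: {value!r}")
```

A real implementation would need a complete language table and a decision about three-letter codes; this sketch simply rejects anything it cannot reduce to two letters.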
b9de240 to
f75c01f
Compare
        "universal-3-5-pro",
    ] = "universal-3-5-pro",
    language_detection: NotGivenOr[bool] = NOT_GIVEN,
    language_code: NotGivenOr[LanguageCode | str] = NOT_GIVEN,
Member
i think we can just accept a str here so we don't need to normalize it like this later: LanguageCode(LanguageCode(language_code).language)
Suggested change
- language_code: NotGivenOr[LanguageCode | str] = NOT_GIVEN,
+ language_code: NotGivenOr[str] = NOT_GIVEN,

Summary
Adds a language_code connect-time parameter to the AssemblyAI STT plugin so users can steer transcription toward a specific language (e.g. "en", "es", "fr") instead of relying on automatic detection / code-switching.

Today the plugin only exposes language_detection, which is an output toggle (whether language_code / language_confidence are returned on turn messages); there is no way to steer the model toward a language. The AssemblyAI streaming API already accepts language_code as a connect-time parameter, so this just plumbs it through.

This is useful for known-monolingual sessions, where steering improves accuracy on short/ambiguous utterances (e.g. disambiguating "see" vs. "si").
Details
- Adds language_code: NotGivenOr[str] to STTOptions and the STT.__init__ signature, forwarded into the connect-time live_config querystring (omitted when unset).
- Gated to the u3-rt-pro family (u3-rt-pro, u3-rt-pro-beta-1, universal-3-5-pro) via the existing _U3_PRO_MODELS validation; passing it with another model raises ValueError. Language steering is applied by the u3-pro ASR; on the universal-streaming models language_code does not steer, so this matches the parameter's documented behavior and how mode / voice_focus are handled.
- Connect-time only: language_code is not part of UpdateConfiguration, so it is not added to update_options.
- Related: the mode param PR (feat(assemblyai): add streaming mode (latency/accuracy preset) param #6156).
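The gating and querystring behavior listed above can be sketched roughly as follows. This is a standalone illustration, not the plugin code: build_connect_query is a hypothetical function, None stands in for NOT_GIVEN, and the speech_model key name is an assumption about the connect-time parameters.

```python
from typing import Optional
from urllib.parse import urlencode

# Model names taken from the PR description; the set mirrors the role
# of _U3_PRO_MODELS but is redefined here so the sketch is standalone.
U3_PRO_MODELS = frozenset({"u3-rt-pro", "u3-rt-pro-beta-1", "universal-3-5-pro"})


def build_connect_query(model: str, language_code: Optional[str] = None) -> str:
    """Build a connect-time querystring, gating language_code by model."""
    # None plays the part of NOT_GIVEN in this sketch.
    if language_code is not None and model not in U3_PRO_MODELS:
        raise ValueError(
            f"language_code is only supported on {sorted(U3_PRO_MODELS)}, got {model!r}"
        )
    # "speech_model" is an assumed key name, used for illustration only.
    params = {"speech_model": model}
    if language_code is not None:
        # Forwarded only when set, so an unset value never reaches the URL.
        params["language_code"] = language_code
    return urlencode(params)
```

Calling build_connect_query("universal-3-5-pro", "es") includes language_code in the result, while the same call with a model outside the set raises ValueError before any connection would be attempted.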
Added unit tests in tests/test_plugin_assemblyai_stt.py:
- NOT_GIVEN (language_code unset)
- ValueError on a non-u3-rt-pro-family model
- update_options (connect-time only)

ruff check and ruff format pass on both changed files.