refactor(voice): move mic ownership from clx to otoji #135
Open
snomiao wants to merge 8 commits into
Broaden the existing `.claude/settings.local.json` rule to the whole `.claude/` directory (Claude Code stores per-project session caches there), and add `.DS_Store` plus the bin/ artifacts (`clx-prompt`, `clx-prompt-slint`) compiled by build.sh from tracked source. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Windows adapter's bin name in Cargo.toml is `clx-rust`, so cargo
emits `rs/target/release/clx-rust.exe`. The packaging step expected
`clx.exe`, so the v2.0.0-beta.3 Windows build failed at the
Copy-Item step ("Cannot find path ... clx.exe because it does not
exist") even though the cargo build itself succeeded in 11 minutes.
Source from `clx-rust.exe`, rename to `clx.exe` in the staging dir so
distribution still ships a clean `clx.exe`. The back-commit-to-main
step at line 122 already does this rename in reverse, so its inputs
remain consistent.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull request overview
This PR shifts microphone ownership from clx to the external otoji subprocess by switching from stdin-audio piping (otoji listen --plain -) to having otoji open the microphone directly (otoji listen --plain), so macOS mic permission is attributed to otoji.
Changes:
- Update `voice_otoji` to spawn `otoji listen --plain` without stdin audio piping (remove cpal/VPIO capture + WAV/resample helpers).
- Update `voice` to stop computing/passing AEC settings into the otoji backend.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| rs/core/src/modules/voice_otoji.rs | Drops stdin WAV streaming + mic capture threads; launches otoji to open mic itself and reads JSON events from stdout. |
| rs/core/src/modules/voice.rs | Removes AEC gating/pass-through when launching the otoji backend. |
Comment on lines 558 to 559

    // VPIO AEC enable for the otoji subprocess mic path.
    // "always" → always on

Comment on lines 283 to 288

      let mut cmd = Command::new("otoji");
      let ctx_path = super::voice_ptt::ptt_context_file_path();
      let mut args: Vec<String> = vec![
    -     "listen".into(), "--plain".into(), "-".into(),
    +     "listen".into(), "--plain".into(),
          // "openai" route goes through OpenAiPolisher which honors the
          // OTOJI_POLISH_BASE_URL / _API_KEY / _MODEL env vars. Default
`gh release edit` doesn't accept `--generate-notes` (only create does). When the release already exists (re-tag scenario), the fallback path fired with the wrong flag set and crashed with "unknown flag", which also skipped all dependent build jobs because of `needs: create-release`. Use distinct flag sets for the two paths. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
capslockx-windows builds without the `stt` feature (whisper-rs doesn't compile on Windows), so sherpa-rs's build.rs is never invoked and no runtime DLLs are produced. clx.exe runs as a STT-stub in that case; the DLLs would only be needed if `stt` were enabled. Bundle DLLs when present, skip silently when not, instead of throwing "No DLLs found ... sherpa-rs build.rs may have failed". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pairs with the workflow fix in 08e1d64. NSIS aborts with "File \"*.dll\" -> no files found" when capslockx-windows builds without the stt feature (the current default — whisper-rs doesn't compile on Windows). Add /nonfatal so the installer build succeeds without DLLs; clx.exe runs as a stt-stub in that configuration. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The header itself flagged it as `# @depreacted use semantic-release`, and release-rust.yml has been the canonical release path since the Rust rewrite. The deprecated workflow kept firing on every `v*` tag and failing because `github-release-from-cc-changelog` couldn't find a matching CHANGELOG entry, polluting the actions tab with red runs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
clx was capturing audio via cpal/VPIO and piping raw WAV to otoji's stdin. This meant clx held the macOS microphone permission, not otoji. Switch to `otoji listen --plain` (no `-` argument) so otoji opens the mic itself and requests the permission on its own behalf. clx no longer needs mic access — it only reads JSON-line AsrEvents from otoji's stdout. Removes ~240 lines: the otoji-mic thread, cpal capture, VPIO path, write_wav_header, resample_linear, and the aec_enabled parameter from OtojiBackend::start(). AEC can be added to otoji directly when needed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
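The stdout side of this arrangement can be sketched as follows. This is a minimal illustration, not the code in `voice_otoji.rs`: the helper name `read_event_lines` and the event shape in `main` are assumptions, and no JSON parsing is attempted.

```rust
use std::io::{BufRead, BufReader, Read};

/// Hypothetical helper: collects non-empty lines from a reader.
/// Each line is assumed to carry one JSON-encoded event.
fn read_event_lines<R: Read>(stdout: R) -> Vec<String> {
    BufReader::new(stdout)
        .lines()
        .map_while(Result::ok) // stop at the first I/O error
        .map(|line| line.trim().to_string())
        .filter(|line| !line.is_empty())
        .collect()
}

fn main() {
    // Stand-in for the child's stdout; the event fields are made up.
    let fake = b"{\"type\":\"partial\"}\n\n{\"type\":\"final\"}\n";
    for line in read_event_lines(&fake[..]) {
        println!("event: {line}");
    }
}
```

In a real caller the reader would presumably be the `stdout` handle of a child started with `std::process::Command` and `Stdio::piped()`, read on its own thread so the caller does not block.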
OtojiBackend::start() gets aec_enabled back so CLX can pass --aec to otoji on macOS when voice.aec_mode = always/dual-only. AEC now runs inside otoji (VoiceProcessingIO) instead of clx. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
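A minimal sketch of how the argument list might be assembled after this commit. The helper `otoji_listen_args` is hypothetical (the real logic lives in `OtojiBackend::start()`), and `aec_enabled` is assumed to have been computed by the caller from `voice.aec_mode`.

```rust
/// Hypothetical helper: builds the argument list for `otoji listen`.
fn otoji_listen_args(aec_enabled: bool) -> Vec<String> {
    // No trailing "-": otoji opens the microphone itself instead of reading stdin.
    let mut args: Vec<String> = vec!["listen".into(), "--plain".into()];
    // AEC runs inside otoji, so the flag is only passed on macOS.
    if aec_enabled && cfg!(target_os = "macos") {
        args.push("--aec".into());
    }
    args
}

fn main() {
    println!("{:?}", otoji_listen_args(true));
}
```

Keeping the platform check next to the flag means the call site can pass `aec_enabled` unconditionally; on other platforms the argument list is the same as before.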
77e6a9a to a940782
Summary

- `clx` was capturing audio via cpal/VPIO and piping raw WAV to `otoji listen --plain -`
- `clx` held the macOS microphone permission and AEC logic, not `otoji`
- `otoji listen --plain [--aec]` so otoji owns both the mic permission AND the AEC

Changes

otoji (submodule):

- `src/audio/vpio.rs`: new VPIO AudioUnit backend: echo-cancelled mic capture, 30x gain, 48kHz→16kHz resample, emits `AudioChunk` to `AudioTx`
- `src/audio/mod.rs`: expose `vpio` module on macOS
- `src/main.rs`: `--aec` flag on `listen`; `run_listen_vpio` function that uses VPIO instead of cpal
- `build.rs`: link `AudioToolbox` framework

clx:

- `voice_otoji.rs`: restore `aec_enabled` param to `start()`; pass `--aec` to otoji args on macOS when `aec_mode = always/dual-only`
- `voice.rs`: restore `aec_enabled` computation at call site

Result

- […] `otoji` (not `clx`)
- `clx` no longer touches audio at all

Test plan

- […] (`./build.sh`)
- […] `otoji`

🤖 Generated with Claude Code
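For readers unfamiliar with the gain and 48kHz→16kHz step listed under `src/audio/vpio.rs`, a rough sketch follows. It is not otoji's implementation: the function name is hypothetical, and averaging each group of three samples is a simple stand-in for whatever resampler the module actually uses.

```rust
/// Hypothetical sketch: downsample by an integer factor by averaging each
/// group of `factor` samples, then apply a fixed gain and clamp to [-1, 1].
/// `factor` must be non-zero; trailing samples that do not fill a group are dropped.
fn gain_and_downsample(input: &[f32], gain: f32, factor: usize) -> Vec<f32> {
    input
        .chunks_exact(factor)
        .map(|group| {
            let avg = group.iter().sum::<f32>() / factor as f32;
            (avg * gain).clamp(-1.0, 1.0)
        })
        .collect()
}

fn main() {
    // 48 kHz -> 16 kHz is an exact 3:1 ratio.
    let quiet = vec![0.01_f32; 48];
    let out = gain_and_downsample(&quiet, 30.0, 3);
    println!("{} samples in, {} samples out", quiet.len(), out.len());
}
```

The clamp matters with a gain as large as 30x, since louder input would otherwise leave the normalized sample range.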