fix(model): default synthetic models to text-only input to avoid image 400s #252
Open
craigamcw wants to merge 1 commit into
…e 400s

Synthetic models (built for ids not in the pi-ai registry, e.g. deepseek-v4-pro via Ollama) hard-coded `input: ['text', 'image']`, falsely claiming vision support. Because the model advertised image input, the openai-completions provider did not filter image content, so screenshots from the GUI/computer-use tools were sent to text-only endpoints. Ollama rejects these with HTTP 400 "this model does not support image input", surfaced to users as an opaque "invalid message format" error.

Default synthetic models to text-only input. Vision-capable models resolved from the pi-ai registry keep their real modalities; only synthetic fallbacks change. For a custom vision endpoint resolved as synthetic this drops images gracefully instead of hard-failing the whole request.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
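The filtering behavior described in this commit message can be illustrated with a minimal sketch. The types and the `filterContent` helper below are hypothetical simplifications written for this example; they are not the actual pi-ai or openai-completions provider code, which this PR does not show. The sketch assumes only what the message states: image content is dropped when the model's `input` list does not include `"image"`.

```typescript
// Hypothetical, simplified types for illustration only.
type Modality = "text" | "image";

interface SketchModel {
  id: string;
  input: Modality[];
}

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Drop image parts when the model does not advertise image input.
// This mirrors the condition `!model.input.includes("image")`.
function filterContent(model: SketchModel, parts: ContentPart[]): ContentPart[] {
  if (model.input.includes("image")) {
    return parts;
  }
  return parts.filter((part) => part.type !== "image_url");
}

const parts: ContentPart[] = [
  { type: "text", text: "What is on screen?" },
  { type: "image_url", image_url: { url: "data:image/png;base64,AAAA" } },
];

// Old synthetic default: falsely claims vision, so the image is sent.
const oldDefault: SketchModel = { id: "deepseek-v4-pro", input: ["text", "image"] };
// New synthetic default: text-only, so the image is filtered out.
const newDefault: SketchModel = { id: "deepseek-v4-pro", input: ["text"] };

const sentWithOldDefault = filterContent(oldDefault, parts);
const sentWithNewDefault = filterContent(newDefault, parts);
```

Under the old default the request body still carries the `image_url` part, which a text-only endpoint rejects; under the new default only the text part remains.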
Summary
Fixes #251.
Text-only models that aren't in pi-ai's registry (e.g. `deepseek-v4-pro` via Ollama Cloud) hard-fail with a provider 400 whenever the conversation includes an image (screenshots from the GUI/computer-use tools, pasted images). The app surfaces this as an opaque "invalid message format".

Root cause: `buildSyntheticPiModel` hard-coded `input: ['text', 'image']`, so synthetic models falsely advertise vision support. The `openai-completions` provider only filters image content when `!model.input.includes("image")`, so images were sent to text-only endpoints. Ollama rejects them with `HTTP 400 "this model does not support image input"`.

This PR defaults synthetic models to `input: ['text']`. We can't know whether an arbitrary unknown model supports vision, and a false vision claim hard-fails the entire request, whereas text-only just drops images gracefully. Vision-capable models resolved from the pi-ai registry keep their real modalities; only synthetic fallbacks change.

Type of change
fix)

Checklist
- `npm run test` passes locally: see Testing (new test + typecheck pass; full suite not run locally)
- `npm run lint` passes locally: see Testing (no new findings from this change)

Testing
Repro and root cause are in #251. Branch is based on `dev`.

Verified locally:
- `image_url` content to `deepseek-v4-pro` at `https://ollama.com/v1` returns `400 "this model does not support image input"`; text-only requests to the same model return `200`. With this change, the synthetic model reports `input: ['text']`, so `convertMessages` drops image content and the request stays text-only.
- `tests/synthetic-model-input.test.ts` (2 assertions): passes via `npx vitest run`.
- `npx tsc --noEmit`: the change introduces no type errors (one unrelated pre-existing error remains on `dev`: `src/main/config/config-store.ts:369 'getConfigKey' is declared but never read`).
- `eslint src/main/claude/pi-model-resolution.ts`: no new findings on the changed lines (the file has one pre-existing `@typescript-eslint/no-explicit-any` at an unrelated location, left untouched to keep the diff focused).

Note: the full `npm run test` suite was not run locally (deps installed with `--ignore-scripts`, so native `better-sqlite3` isn't built). CI will run the complete suite.

Trade-off
For a custom vision endpoint that happens to be resolved as a synthetic model (not in the pi-ai registry), images would now be filtered out rather than sent. That's a deliberate, conservative default: dropping images degrades gracefully, whereas the current false-positive vision claim hard-fails every request to text-only models. If desired, this could later be made configurable or driven by `KNOWN_MODEL_SPECS`.
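One possible shape for that follow-up is sketched below. Everything here is an assumption made for illustration: the PR does not show what `KNOWN_MODEL_SPECS` contains, so the sketch uses a separate hypothetical table (`SPEC_TABLE_SKETCH`), a hypothetical resolver (`resolveSyntheticInput`), and a made-up model id (`example-vision-model`). It only demonstrates the precedence idea: explicit configuration first, then a known-spec lookup, then the conservative text-only default.

```typescript
// Hypothetical sketch; not the project's actual KNOWN_MODEL_SPECS or resolver.
type Modality = "text" | "image";

// Assumed lookup table keyed by model id. A Map avoids accidental hits on
// inherited object keys such as "constructor".
const SPEC_TABLE_SKETCH = new Map<string, { input: Modality[] }>([
  ["example-vision-model", { input: ["text", "image"] }],
]);

// Resolve the input modalities for a synthetic model.
// Precedence: explicit override, then table entry, then text-only.
function resolveSyntheticInput(modelId: string, override?: Modality[]): Modality[] {
  if (override !== undefined && override.length > 0) {
    return [...override];
  }
  const spec = SPEC_TABLE_SKETCH.get(modelId);
  if (spec !== undefined) {
    return [...spec.input];
  }
  // Conservative default from this PR: unknown models are text-only.
  return ["text"];
}

const unknownModel = resolveSyntheticInput("deepseek-v4-pro");
const tableModel = resolveSyntheticInput("example-vision-model");
const configuredModel = resolveSyntheticInput("my-custom-endpoint", ["text", "image"]);
```

With this ordering, a user running a custom vision endpoint could opt back in to image input without changing the safe default for every other unknown model.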