fix(model): default synthetic models to text-only input to avoid image 400s by craigamcw · Pull Request #252 · OpenCoworkAI/open-cowork

craigamcw · 2026-06-16T08:01:50Z

Summary

Fixes #251.

Text-only models that aren't in pi-ai's registry (e.g. deepseek-v4-pro via Ollama Cloud) hard-fail with a provider 400 whenever the conversation includes an image, such as a screenshot from the GUI/computer-use tools or a pasted image. The app surfaces this as an opaque "invalid message format" error.

Root cause: buildSyntheticPiModel hard-coded input: ['text', 'image'], so synthetic models falsely advertise vision support. The openai-completions provider only filters image content when !model.input.includes("image"), so images were sent to text-only endpoints. Ollama rejects them with HTTP 400 "this model does not support image input".
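To make the root cause concrete, here is a minimal sketch of the filtering condition described above. The types and the filterContentForModel helper are simplified stand-ins invented for illustration, not the actual pi-ai or openai-completions code; only the `input.includes("image")` check mirrors what the PR describes.

```typescript
// Hypothetical, simplified shapes; the real pi-ai types are richer.
type Modality = "text" | "image";

interface ModelInfo {
  id: string;
  input: Modality[];
}

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string };

// Drop image parts only when the model does not advertise image input.
function filterContentForModel(model: ModelInfo, parts: ContentPart[]): ContentPart[] {
  if (model.input.includes("image")) {
    return parts; // model claims vision, so images pass through untouched
  }
  return parts.filter((part) => part.type !== "image");
}

const parts: ContentPart[] = [
  { type: "text", text: "What is on screen?" },
  { type: "image", data: "<base64>", mimeType: "image/png" },
];

// Old synthetic default: claims vision, so the image is still sent.
const before = filterContentForModel(
  { id: "deepseek-v4-pro", input: ["text", "image"] },
  parts,
);

// New synthetic default: text-only, so the image is dropped before sending.
const after = filterContentForModel({ id: "deepseek-v4-pro", input: ["text"] }, parts);

console.log(before.length, after.length); // 2 1
```

With the old default the image part survives and reaches the text-only endpoint; with the new default only the text part remains.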

This PR defaults synthetic models to input: ['text']. We can't know whether an arbitrary unknown model supports vision, and a false vision claim hard-fails the entire request, whereas text-only just drops images gracefully. Vision-capable models resolved from the pi-ai registry keep their real modalities — only synthetic fallbacks change.
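The intended behaviour can be sketched roughly as follows. This is an illustration under assumptions, not the code in pi-model-resolution.ts: the registry map, its single entry, and the resolveModel and buildSyntheticModel names are all hypothetical.

```typescript
type Modality = "text" | "image";

interface ResolvedModel {
  id: string;
  input: Modality[];
  synthetic: boolean;
}

// Stand-in for the pi-ai registry; this entry is made up for illustration.
const registry = new Map<string, Modality[]>([
  ["example-vision-model", ["text", "image"]],
]);

// Synthetic fallback: capabilities are unknown, so claim text only.
function buildSyntheticModel(id: string): ResolvedModel {
  return { id, input: ["text"], synthetic: true };
}

// Registry hits keep their declared modalities; misses fall back to synthetic.
function resolveModel(id: string): ResolvedModel {
  const known = registry.get(id);
  if (known !== undefined) {
    return { id, input: [...known], synthetic: false };
  }
  return buildSyntheticModel(id);
}

const fromRegistry = resolveModel("example-vision-model");
const fallback = resolveModel("deepseek-v4-pro");

console.log(fromRegistry.input, fallback.input);
```

Only the fallback path changes in this sketch, which matches the scope claimed above: a model found in the registry is returned with whatever modalities the registry lists.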

Type of change

Bug fix (fix)

Checklist

Code follows the project style (TypeScript strict, ESLint, Prettier) — changed lines only
Commit messages follow Conventional Commits
Self-review completed — no debug logs, no commented-out code
Tests added or updated for the changed behaviour
npm run test passes locally — see Testing (new test + typecheck pass; full suite not run locally)
npm run lint passes locally — see Testing (no new findings from this change)
UI changes tested on both macOS and Windows — N/A, no UI; provider-agnostic model-resolution logic
New user-facing strings added to i18n files — N/A

Testing

Repro and root cause are in #251. Branch is based on dev.

Verified locally:

Direct API check: posting image_url content to deepseek-v4-pro at https://ollama.com/v1 returns 400 "this model does not support image input"; text-only requests to the same model return 200. With this change, the synthetic model reports input: ['text'], so convertMessages drops image content and the request stays text-only.
New test tests/synthetic-model-input.test.ts (2 assertions) — passes via npx vitest run.
npx tsc --noEmit — the change introduces no type errors (one unrelated pre-existing error remains on dev: src/main/config/config-store.ts:369 'getConfigKey' is declared but never read).
eslint src/main/claude/pi-model-resolution.ts — no new findings on the changed lines (the file has one pre-existing @typescript-eslint/no-explicit-any at an unrelated location, left untouched to keep the diff focused).
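For reference, the two request bodies compared in the direct API check look roughly like this. The sketch only builds the payloads in the OpenAI-style chat format and sends nothing; the buildRequest helper, the prompts, and the placeholder data URL are illustrative assumptions, not the exact requests used.

```typescript
// OpenAI-style chat content parts (text and image_url).
type ChatPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
  model: string;
  messages: { role: "user"; content: ChatPart[] }[];
}

// Build a single-turn request, optionally attaching one image as a data URL.
function buildRequest(model: string, prompt: string, imageDataUrl?: string): ChatRequest {
  const content: ChatPart[] = [{ type: "text", text: prompt }];
  if (imageDataUrl !== undefined) {
    content.push({ type: "image_url", image_url: { url: imageDataUrl } });
  }
  return { model, messages: [{ role: "user", content }] };
}

// Text-only body: the shape that returned 200 in the check above.
const textOnly = buildRequest("deepseek-v4-pro", "hello");

// Body with image_url content: the shape that returned 400 in the check above.
// The data URL is a placeholder, not a real image.
const withImage = buildRequest(
  "deepseek-v4-pro",
  "describe this",
  "data:image/png;base64,AAAA",
);

console.log(JSON.stringify(textOnly));
console.log(JSON.stringify(withImage));
```

After this change, a synthetic model's outgoing requests should only ever resemble the first shape, since image parts are filtered before the body is built.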

Note: the full npm run test suite was not run locally (deps installed with --ignore-scripts, so native better-sqlite3 isn't built). CI will run the complete suite.

Trade-off

For a custom vision endpoint that happens to be resolved as a synthetic model (not in the pi-ai registry), images would now be filtered out rather than sent. That's a deliberate, conservative default: dropping images degrades gracefully, whereas the current false-positive vision claim hard-fails every request to text-only models. If desired, this could later be made configurable or driven by KNOWN_MODEL_SPECS.
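If the default were made configurable later, one possible shape is an explicit per-model override that opts a synthetic model back into image input. This is a speculative sketch only: inputOverrides and syntheticInputFor are hypothetical names, and nothing here reflects the actual structure of KNOWN_MODEL_SPECS.

```typescript
type Modality = "text" | "image";

// Hypothetical user-supplied overrides keyed by model id.
const inputOverrides = new Map<string, Modality[]>([
  ["my-custom-vision-model", ["text", "image"]],
]);

// Use the override when one exists; otherwise keep the conservative text-only default.
function syntheticInputFor(
  modelId: string,
  overrides: Map<string, Modality[]> = inputOverrides,
): Modality[] {
  const override = overrides.get(modelId);
  return override !== undefined ? [...override] : ["text"];
}

const customVision = syntheticInputFor("my-custom-vision-model");
const unknownModel = syntheticInputFor("deepseek-v4-pro");

console.log(customVision, unknownModel);
```

The point of the design is that vision stays opt-in for unknown models: a missing override can only cause dropped images, never a hard-failed request.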

…e 400s

Synthetic models (built for ids not in the pi-ai registry, e.g. deepseek-v4-pro via Ollama) hard-coded `input: ['text', 'image']`, falsely claiming vision support. Because the model advertised image input, the openai-completions provider did not filter image content, so screenshots from the GUI/computer-use tools were sent to text-only endpoints. Ollama rejects these with HTTP 400 "this model does not support image input", surfaced to users as an opaque "invalid message format" error.

Default synthetic models to text-only input. Vision-capable models resolved from the pi-ai registry keep their real modalities; only synthetic fallbacks change. For a custom vision endpoint resolved as synthetic this drops images gracefully instead of hard-failing the whole request.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

craigamcw wants to merge 1 commit into OpenCoworkAI:dev from craigamcw:fix/synthetic-model-text-only-input
