
fix(core): enforce input-model strictness and required params in ToolCatalog [TOO-771] #829

Open
jottakka wants to merge 5 commits into main from claude/modest-newton-b1ed2e

Conversation

@jottakka jottakka commented Apr 23, 2026

Summary

Closes TOO-771 (https://linear.app/arcadedev/issue/TOO-771)

Auto-generated Pydantic input models for @tool-decorated functions currently:

  1. Silently drop unknown kwargs (no extra='forbid'), so edit_requests=... instead of requests=... produces isError: false with zero indication anything was wrong.
  2. Default omitted required params to None (every field is effectively optional at validation time, because create_func_models passes default=None unconditionally), so the wire schema's required=True never turns into a Pydantic error.
  3. Silently drop typos inside nested TypedDict request dicts ({"txt": "x"} instead of {"text": "x"}), because the raw TypedDict → Pydantic path ignores extras.
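
The first two failure modes can be reproduced with plain Pydantic, independent of Arcade. The following is a minimal sketch, assuming Pydantic v2; `LooseInput` and its field names are hypothetical stand-ins for a generated input model, not the code in catalog.py.

```python
from pydantic import Field, create_model

# Hypothetical stand-in for a generated input model: every field is given
# default=None and the model keeps Pydantic's default extra="ignore".
LooseInput = create_model(
    "LooseInput",
    document_id=(str, Field(default=None)),
    requests=(list, Field(default=None)),
)

# Misspelled kwarg: "edit_requests" is silently dropped and "requests"
# falls back to None, so validation succeeds with no error.
parsed = LooseInput(document_id="doc-1", edit_requests=[{"text": "x"}])
print(parsed.model_dump())  # {'document_id': 'doc-1', 'requests': None}

# Omitting every "required" param also validates, because each has a default.
empty = LooseInput()
print(empty.model_dump())  # {'document_id': None, 'requests': None}
```

Neither call raises, which matches the "isError: false" symptom described above.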

Reported by francisco@arcade.dev after ~1200 Google Docs edit requests were discarded across a multi-hour session; the agent had no signal that anything was being ignored. Production reproductions against GoogleDocs.InsertTextAtEndOfDocument@5.2.0 and GoogleDocs.EditDocument@5.2.0 confirm the same silent drops at the deployed Arcade Cloud MCP.

Changes — libs/arcade-core/arcade_core/catalog.py

  • Top-level strictness: create_func_models now sets ConfigDict(extra='forbid') on the generated input model. Unknown kwargs raise ValidationError, surfaced as ToolInputError / ErrorKind.TOOL_RUNTIME_BAD_INPUT_VALUE by the existing ToolExecutor._serialize_input branch.
  • Real required fields: Required params are emitted without default= so Pydantic enforces them at validation time. Adds a ParamInfo.has_explicit_default flag to distinguish "no default" from "default=None", preventing regressions on def f(x: str = None) style signatures (which are now correctly typed as Optional[str] with default=None).
  • Nested TypedDict strictness (input only): New create_model_from_typeddict(..., strict=True) path wraps input-side TypedDicts in a _StrictTypedDictBaseModel(extra='forbid'). Output-side TypedDicts keep the permissive default so tools returning dicts with extra keys from upstream APIs (e.g. Google, Slack) don't break.
  • @model_serializer(mode='wrap') on _TypedDictBaseModel drops unset fields during nested serialization. The previous model_dump() Python-level override was bypassed by Pydantic v2's Rust-level outer serializer.
  • Edge cases handled: Optional[TypedDict], list[TypedDict], list[Optional[TypedDict]], list[TypedDict] inside a parent TypedDict, two params sharing a TypedDict type (param-name-qualified model names to avoid $defs collisions), and bare list (no type arg) destructuring.
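
As a rough illustration of the first two changes, the sketch below builds a strict model with `create_model`, assuming Pydantic v2. `StrictInput`, `error_types`, and the field names are hypothetical; the real generation logic lives in create_func_models and may differ.

```python
from pydantic import ConfigDict, Field, ValidationError, create_model

# Hypothetical strict input model: extra="forbid" rejects unknown kwargs,
# and fields declared without a default are enforced as required.
StrictInput = create_model(
    "StrictInput",
    __config__=ConfigDict(extra="forbid"),
    document_id=(str, Field(description="Target document ID")),
    requests=(list, Field(description="Edit requests to apply")),
)


def error_types(payload: dict) -> list:
    """Return the sorted Pydantic error types raised for a payload."""
    try:
        StrictInput(**payload)
    except ValidationError as exc:
        return sorted(err["type"] for err in exc.errors())
    return []


# The misspelled kwarg is rejected AND the missing required field is reported.
print(error_types({"document_id": "d", "edit_requests": []}))
# ['extra_forbidden', 'missing']
print(error_types({"document_id": "d", "requests": []}))  # []
```

In the PR, the resulting ValidationError is what gets surfaced as ToolInputError; that mapping is not reproduced here.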

Version bumps

arcade-core 4.7.0 → 4.8.0 (minor: validation semantics change).
arcade-tdk and arcade-mcp-server both pin arcade-core>=4.8.0,<5.0.0.

Review history

This PR went through three rounds of local ACR review. Two round-3 findings, theoretical model_serializer kwarg forwarding (1/5) and hypothetical non-None TypedDict field defaults (1/5), are documented in-source but left unaddressed because the actual call paths are covered by tests.

Test plan

  • 22 new tests in libs/tests/core/test_input_model_strictness.py covering all three bug shapes + edge cases + regressions.
  • Full test suite: 2698 pass, 1 skip, 0 regressions.
  • make check clean on arcade-core (pre-commit, ruff, mypy).
  • End-to-end verified: ToolExecutor.run now returns ErrorKind.TOOL_RUNTIME_BAD_INPUT_VALUE for both unknown kwargs and missing required fields (previously both silently succeeded).
  • CI green across Python 3.10–3.14 / Ubuntu / Windows / macOS.

🤖 Generated with Claude Code


Note

Medium Risk
Changes runtime validation semantics for all tool calls, so previously tolerated extra/missing fields will now raise and may break existing clients; behavior is well-covered by new tests but affects a core execution path.

Overview
Auto-generated Pydantic input models for @tool functions are now strict: unknown top-level kwargs are rejected (extra="forbid"), and parameters without an explicit default are truly required at validation time (no more implicit default=None). A new has_explicit_default flag distinguishes “no default” vs = None so signatures like def f(x: str = None) (and Field(default=None) / default_factory) remain optional and are re-annotated as Optional[...] when needed.
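
For plain Python signatures, the "no default" versus "= None" distinction can be read from `inspect.Parameter.empty`. The sketch below uses only the standard library; `explicit_defaults`, `effective_annotation`, and `insert_text` are hypothetical names for illustration and are not the PR's extract_python_param_info.

```python
import inspect
from typing import Optional


def explicit_defaults(func) -> dict:
    """Map each parameter name to whether the signature gives it a default."""
    params = inspect.signature(func).parameters
    return {
        name: p.default is not inspect.Parameter.empty
        for name, p in params.items()
    }


def effective_annotation(param: inspect.Parameter):
    """Re-annotate `x: str = None` as Optional[str]; leave others unchanged."""
    if param.default is None:
        return Optional[param.annotation]
    return param.annotation


# Hypothetical tool function used only to exercise the helpers.
def insert_text(document_id: str, text: str = None, index: int = 0):
    ...


print(explicit_defaults(insert_text))
# {'document_id': False, 'text': True, 'index': True}

text_param = inspect.signature(insert_text).parameters["text"]
print(effective_annotation(text_param) == Optional[str])  # True
```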

Nested TypedDict inputs are also validated strictly by wrapping them (including list[TypedDict] and related edge cases) in generated Pydantic models that forbid extra keys, while TypedDict outputs remain permissive; serialization of total=False TypedDict models is fixed via @model_serializer to drop unset fields even when nested. Adds a comprehensive test suite for these failure modes, and bumps package versions/pins (arcade-core 4.8.0 and downstream deps) to reflect the behavior change.
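
The serialization fix can be pictured with a wrap-mode `@model_serializer` that filters on `model_fields_set`. This is a minimal sketch assuming Pydantic v2; `SparseModel`, `Style`, and `Request` are hypothetical and only approximate the role of _TypedDictBaseModel.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, model_serializer


class SparseModel(BaseModel):
    """Hypothetical base class: serialize only fields the caller actually set."""

    @model_serializer(mode="wrap")
    def _drop_unset(self, handler) -> Dict[str, Any]:
        data = handler(self)  # run Pydantic's default serialization first
        return {k: v for k, v in data.items() if k in self.model_fields_set}


class Style(SparseModel):
    bold: Optional[bool] = None
    italic: Optional[bool] = None


class Request(BaseModel):
    text: str
    style: Optional[Style] = None


# "italic" was never provided, so it is dropped even though Style is nested
# inside an ordinary model and serialized by the outer model_dump() call.
req = Request(text="hi", style={"bold": True})
print(req.model_dump())  # {'text': 'hi', 'style': {'bold': True}}
```

Because the serializer is part of the model's schema, it applies during nested serialization, whereas an overridden model_dump() on the inner class would only run when called on that instance directly.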

Reviewed by Cursor Bugbot for commit 737a177.

…Catalog

Auto-generated Pydantic input models for @tool-decorated functions were
silently dropping unknown kwargs and defaulting omitted required params to
None, so typos in tool calls produced isError: false with no indication the
arguments were ignored. Also fixes nested TypedDict typos being silently
dropped inside request dicts.

Changes in libs/arcade-core/arcade_core/catalog.py:
- Top-level input model now sets ConfigDict(extra='forbid') so unknown
  kwargs raise ValidationError (surfaced as ToolInputError by ToolExecutor).
- Required params (no Optional[...] annotation, no explicit default in the
  signature) are now emitted without default=, so Pydantic enforces them at
  validation time. Adds a ParamInfo.has_explicit_default flag to
  distinguish "no default" from "default=None", preventing regressions on
  def f(x: str = None) style signatures.
- Nested TypedDict input fields are routed through a new strict Pydantic
  wrapper (create_model_from_typeddict(..., strict=True)) so typos inside
  request dicts also raise. Output-side TypedDicts remain permissive to
  preserve pass-through of extra keys from upstream APIs.
- @model_serializer(mode='wrap') on _TypedDictBaseModel drops unset fields
  during nested serialization (Pydantic v2's outer serializer does not
  invoke the Python-level model_dump override).
- Wraps Optional[TypedDict] inside lists and inside parent TypedDict fields
  so strictness propagates. Param-name-qualified model names prevent $defs
  collisions when two params share a TypedDict type.
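
One way to picture the nested wrapper is to turn a TypedDict into a Pydantic model that forbids extras and then use it as a list item type. The sketch below assumes Pydantic v2; `strict_model_from_typeddict`, `TextRequest`, and the `requests_TextRequest` naming are hypothetical and are not the PR's create_model_from_typeddict.

```python
from typing import List, Optional, TypedDict, get_type_hints

from pydantic import ConfigDict, ValidationError, create_model


class TextRequest(TypedDict):
    """Hypothetical nested request shape."""
    text: str


def strict_model_from_typeddict(td, name: str):
    """Build a Pydantic model from a TypedDict that rejects unknown keys."""
    fields = {}
    for key, annotation in get_type_hints(td).items():
        if key in td.__required_keys__:
            fields[key] = (annotation, ...)  # required, no default
        else:
            fields[key] = (Optional[annotation], None)
    return create_model(name, __config__=ConfigDict(extra="forbid"), **fields)


# Qualify the model name with the parameter name so two params sharing a
# TypedDict type do not collide in the generated schema.
ItemModel = strict_model_from_typeddict(TextRequest, "requests_TextRequest")
ToolInput = create_model(
    "ToolInput",
    __config__=ConfigDict(extra="forbid"),
    requests=(List[ItemModel], ...),
)


def errors_for(payload: dict) -> set:
    """Return the set of (error type, location) pairs for a payload."""
    try:
        ToolInput(**payload)
    except ValidationError as exc:
        return {(err["type"], err["loc"]) for err in exc.errors()}
    return set()


# The typo "txt" is rejected at its exact location inside the list.
print(errors_for({"requests": [{"txt": "x"}]}))
```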

Version bumps: arcade-core 4.7.0 → 4.8.0; arcade-tdk and arcade-mcp-server
pin arcade-core>=4.8.0 since tool-call validation semantics change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
codecov Bot commented Apr 23, 2026

Codecov Report

❌ Patch coverage is 98.07692% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| libs/arcade-core/arcade_core/catalog.py | 98.07% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| libs/arcade-core/arcade_core/catalog.py | 91.10% <98.07%> (+0.30%) ⬆️ |

jottakka and others added 3 commits April 23, 2026 21:38
Libraries whose pyproject.toml metadata changes in this PR get a version
bump so the behavioral contract tightening (arcade-core>=4.8.0 required
for strict input validation) is addressable by consumers via their lockfiles:

- arcade-tdk 3.8.0 → 3.9.0
- arcade-mcp-server 1.20.0 → 1.21.0 (also retargets arcade-tdk>=3.9.0,
  arcade-serve>=3.3.0)
- arcade-serve 3.2.3 → 3.3.0 (retargets arcade-core>=4.8.0 — serve calls
  ToolExecutor.run with the new input_model semantics via BaseWorker)
- arcade-mcp (root) 1.14.0 → 1.15.0 (retargets arcade-core>=4.8.0,
  arcade-mcp-server>=1.21.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…models

extract_field_info raises ToolInputSchemaError when a parameter lacks an
Annotated description, so by the time tool_field_info reaches
create_func_models its description is guaranteed non-None. The
`or "No description provided."` branch has been unreachable on main and
is now removed.

Output-side fallbacks in determine_output_model are retained — return
annotations may legitimately omit descriptions (e.g. `-> str` without
Annotated).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…default

The has_explicit_default flag is populated by two code paths — one for
plain Python signatures (extract_python_param_info) and one for
pydantic.Field(...) declarations (extract_pydantic_param_info). The
Python-signature path was covered by existing tests; this adds four
tests for the Pydantic-Field path:

- Field() with no default/factory → has_explicit_default=False, required
- Field(default=None) → has_explicit_default=True, optional, nullable
- Field(default="x") → has_explicit_default=True, optional, non-null
- Field(default_factory=list) → has_explicit_default=True, optional
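
The four cases above can be checked against Pydantic's own `FieldInfo` objects. The sketch below assumes Pydantic v2; `field_has_explicit_default` is a hypothetical helper and not the PR's extract_pydantic_param_info.

```python
from pydantic import Field
from pydantic.fields import FieldInfo
from pydantic_core import PydanticUndefined


def field_has_explicit_default(info: FieldInfo) -> bool:
    """True when Field(...) carries either a default value or a default factory."""
    return info.default is not PydanticUndefined or info.default_factory is not None


cases = {
    "no default": Field(description="required"),
    "default=None": Field(default=None),
    'default="x"': Field(default="x"),
    "default_factory=list": Field(default_factory=list),
}

for label, info in cases.items():
    print(label, field_has_explicit_default(info))
```

Checking against `PydanticUndefined` rather than `None` is what keeps `Field(default=None)` distinguishable from a field with no default at all.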

Follows up on fresh-context reviewer's should-fix #3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jottakka jottakka self-assigned this Apr 24, 2026
@jottakka jottakka requested a review from EricGustin April 24, 2026 17:33
@jottakka jottakka marked this pull request as ready for review April 24, 2026 17:33
Resolve version-bump conflicts in pyproject.toml files. Bumped
arcade-mcp-server to 1.21.2 to layer the PR's dep update on top of
main's 1.21.1 from #826.
@jottakka jottakka removed their assignment May 4, 2026
@jottakka jottakka changed the title fix(core): enforce input-model strictness and required params in ToolCatalog fix(core): enforce input-model strictness and required params in ToolCatalog [TOO-771] May 5, 2026
@jottakka jottakka self-assigned this May 5, 2026
@github-actions

This pull request has been automatically marked as stale because it has had no activity for 14 days. It will be closed in 14 days if no further activity occurs. If this is still relevant, please leave a comment or remove the stale label.

@github-actions github-actions Bot added stale and removed stale labels May 20, 2026