Merge pull request #233 from Deep-CodeAI/feat/4629-agui-reasoning

Skobeltsyn · web-flow · commit a45dc25a1346 · 2026-06-19T16:50:00.000+03:00
feat(#4629): AG-UI streams REASONING events (live model thinking)
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,18 @@ All notable changes to Agents.KT are documented here. The format follows [Keep a
 
 ## [Unreleased]
 
+### Added — AG-UI now streams REASONING events (live model thinking) (#4629, epic #4523)
+
+`AgUiServer` previously surfaced the lifecycle/text/tool/step event families; it now also bridges
+`AgentEvent.Reasoning` (the model's thinking stream, #2406 — Claude/DeepSeek/Ollama) into AG-UI's
+**REASONING** family: `REASONING_START` → `REASONING_MESSAGE_START` → `REASONING_MESSAGE_CONTENT`* →
+`REASONING_MESSAGE_END` → `REASONING_END`, all keyed by one `messageId` (the deprecated `THINKING_*`
+names are not used). A frontend (e.g. a CopilotKit chat) can now render live reasoning instead of a
+spinner. Reasoning precedes the answer, so `AgUiEventBridge` opens the block on the first reasoning
+chunk and closes it before any answer token, tool call, step finish, or run finish — the same ordering
+discipline the text state machine already enforces. STATE events and client-tool round-trips remain the
+documented AG-UI follow-ups. 2 new tests.
+
 ### Added — audit-ledger now records cross-cutting agent misbehaviour (#2905, epic #2882)
 
 `agent.events.ledger(file)` previously chained only tool-action verdicts (`APPROVED` / `DENIED` /
diff --git a/README.md b/README.md
@@ -209,7 +209,7 @@ These APIs work in `main`, are unit-tested, and are exercised by integration tes
 - **Web-grounded search tool (`perplexitySearch`)** — `tools { +perplexitySearchTool(perplexityKey) }` lets an agent reasoning on its *own* model (Claude/OpenAI/Ollama/…) fetch live, cited facts from Perplexity's Sonar API. The tool is `untrustedOutput = true`, so results are auto-wrapped in the `{"trusted":false}` envelope and the model is warned to treat them as data, not instructions (#642) — web search is the canonical prompt-injection vector. The result renders the answer plus a numbered source list parsed from `search_results[]` (citations land in both the model context and the JSONL audit row). Controls via `perplexitySearchOptions { mode = SearchMode.ACADEMIC; recency = SearchRecency.WEEK; allowDomains("arxiv.org"); contextSize = SearchContextSize.HIGH; structuredOutput(MyType::class) }` map to `search_mode` / `search_recency_filter` / `search_domain_filter` / `web_search_options` / `response_format` json_schema (#3674). Key from `.secrets/perplexity-key`. See [docs/providers.md](docs/providers.md#web-grounded-search-tool-perplexitysearch-3676--3677).
 - **NLWeb endpoint tool (`nlwebSearch`)** — `tools { +nlwebSearchTool(baseUrl = "https://example.com") }` lets an agent query an [NLWeb](https://github.com/nlweb-ai/NLWeb) endpoint — a website's natural-language interface over its **schema.org**-structured content — and fold the ranked, typed results into context (#4541, PRD §12.9). Like `perplexitySearch` it is `untrustedOutput = true` (fetched web content is treated as data, not instructions). `nlwebSearchOptions`-style args via `NlWebSearchOptions(site = "podcasts", mode = NlWebMode.GENERATE)`. NLWeb endpoints need no API key. (Every NLWeb endpoint is also an MCP server, so an NLWeb `/mcp` URL is equally consumable through the existing MCP client — this tool is the zero-wiring `/ask`-over-HTTP path.)
 - **Serve an NLWeb endpoint (`NlWebServer`)** — `NlWebServer.from(agent).start()` exposes the NLWeb `POST /ask` contract (`{query, site?, mode}` → ranked schema.org `results[]`), so agents.kt is consumable by NLWeb clients — the **serve** side to `nlwebSearch`'s **consume** side (#4542). Same `from(agent)` shape, loopback-only JDK-`HttpServer` posture, and threat model as `McpServer.from(agent)` / `A2AServer.from(agent)` (`127.0.0.1`, optional bearer, front with a gateway). The query is the agent's input; an `NlWebSearchResult` output is served verbatim (ranked schema.org results), any other output becomes the `summary` answer — back the agent's retrieval with the RAG `EmbeddingStore` seam (`:agents-kt-rag`) or whatever you like.
-- **Serve a frontend over AG-UI (`AgUiServer`)** — `AgUiServer.from(agent).start()` exposes an agent over the [AG-UI](https://github.com/ag-ui-protocol/ag-ui) protocol — the **agent↔user/frontend** layer (e.g. a CopilotKit React chat), the only interop surface that reaches an end-user UI without us building a frontend (#4523). Not a descriptor exporter — a runtime streaming surface: `POST` a `RunAgentInput` and get an **SSE stream of typed AG-UI events**, bridged live from the agent's `AgentSession` (`Token` → `TEXT_MESSAGE_*`, `ToolCall*` → `TOOL_CALL_*`, `Skill*` → `STEP_*`, wrapped in `RUN_STARTED … RUN_FINISHED`). Same `from(agent)` shape, loopback-only posture, and threat model as the others; hand-rolled SSE, no AG-UI SDK. **agents.kt now serves the agentic web four ways: MCP, A2A, NLWeb, and AG-UI.**
+- **Serve a frontend over AG-UI (`AgUiServer`)** — `AgUiServer.from(agent).start()` exposes an agent over the [AG-UI](https://github.com/ag-ui-protocol/ag-ui) protocol — the **agent↔user/frontend** layer (e.g. a CopilotKit React chat), the only interop surface that reaches an end-user UI without us building a frontend (#4523). Not a descriptor exporter — a runtime streaming surface: `POST` a `RunAgentInput` and get an **SSE stream of typed AG-UI events**, bridged live from the agent's `AgentSession` (`Token` → `TEXT_MESSAGE_*`, `Reasoning` → `REASONING_*` (live model thinking), `ToolCall*` → `TOOL_CALL_*`, `Skill*` → `STEP_*`, wrapped in `RUN_STARTED … RUN_FINISHED`). Same `from(agent)` shape, loopback-only posture, and threat model as the others; hand-rolled SSE, no AG-UI SDK. **agents.kt now serves the agentic web four ways: MCP, A2A, NLWeb, and AG-UI.**
 - **Charge for an agent endpoint over x402 (`X402PaymentGate`, experimental)** — `X402PaymentGate(requirements, facilitator).gate(handler)` wraps any JDK `HttpHandler` so a resource is served only after a settled stablecoin (USDC) payment over the [x402](https://github.com/x402-foundation/x402) protocol (`402 Payment Required`). Pass it straight to the serve surfaces — `NlWebServer.from(agent, payment = gate)` / `AgUiServer.from(agent, payment = gate)` / `A2AServer.from(agent, payment = gate)` — to let an agent **monetize itself** (#4527/#4557; A2A's agent-card discovery stays free). The **safe, seller-side** half: **we hold no key and take no custody** — the buyer signs an EIP-3009 authorization and a *hosted* `FacilitatorClient` verifies + settles on-chain; we only configure a public `payTo`, and the LLM never touches money (gating is at the HTTP layer). **Fails closed** (any failure → `402`, never served unpaid). Buyer-side autonomous payment is deliberately *not* included (it concentrates the irreversible-money risk). New `agents_engine.x402` package, no deps.
 - **AGNTCY interop (OASF record + DIR directory + Identity badge)** — `agent.toOasfRecord(version, authors, locators)` exports an [AGNTCY](https://github.com/agntcy) [OASF](https://github.com/agntcy/oasf) 1.0.0 discovery record (the third exporter beside `agent.json` and the A2A AgentCard; skills carry taxonomy uids via the opt-in `.oasf("agent_orchestration/multi_agent_planning")` annotation against a vendored, drift-checked taxonomy), and `fromOasfRecord(json)` imports + fail-closed-validates it back (#4518/#4519, PRD §12.6). The `:agents-kt-dir` module publishes/discovers records in the AGNTCY **DIR** content-addressed directory over generated grpc-kotlin stubs for three services — `StoreService` (CRUD: `dir.push(agent.toOasfRecord(...))` → CID, `dir.pull(cid)`), `SearchService` (local content search by typed `DirQuery` facet — skill/domain/author/…), and `RoutingService` (`publish`/`routeSearch` for cross-peer network discovery) (#4520). The trust side ships in `:agents-kt-identity`: `IdentityVerifier.verify(compactJws, jwks)` validates an AGNTCY Identity **badge** (a JOSE/JWS-secured W3C Verifiable Credential) against an issuer's `/.well-known/jwks.json`, fail-closed via `nimbus-jose-jwt` (rejects `alg: none`, `HS*` algorithm-confusion, expiry, tamper, wrong/unknown key — #4521). Verify-only; issuance deferred. DIR Routing/Search + OCI referrers are follow-ups under epic #4517.
 - **Prompt caching across providers** — `agent { caching { enabled = true; cacheSystemPrompt = true; cacheToolDefs = true; cacheConversation = Rolling; ttl = 1.hours; cacheable("doc-id") { ... } } }`. Vendor-neutral DSL drives Anthropic's explicit `cache_control` breakpoints (#2658), OpenAI / DeepSeek automatic prefix caching with a stable `prompt_cache_key` routing hint (#2659 / #2661), Ollama / vLLM / SGLang engine-level KV-cache reuse (no-op hints, #2662), and surfaces cache reads + writes + hit-rate on `TokenUsage` (#2663). A prefix-stability guard (#2657) detects silent cache-busters — timestamps, UUIDs, non-deterministic ordering inside cacheable segments — and warns before you pay for a single non-cached run. Off by default; non-breaking. See [docs/caching.md](docs/caching.md).
diff --git a/docs/prd.md b/docs/prd.md
@@ -2900,7 +2900,7 @@ val hits = dir.search(skill = "agent_orchestration/task_decomposition")
 
 Tracking: epic `[interop] AGNTCY support` with subtasks for OASF export, OASF import/validate, DIR client, and Identity verify.
 
-### 12.7 AG-UI — Agent↔Frontend Serving *(serve shipped, #4523 — `AgUiServer.from(agent)`; STATE/REASONING families + client-tool round-trips are follow-ups)*
+### 12.7 AG-UI — Agent↔Frontend Serving *(serve shipped, #4523 — `AgUiServer.from(agent)`, incl. REASONING #4629; STATE families + client-tool round-trips are follow-ups)*
 
 [AG-UI](https://github.com/ag-ui-protocol/ag-ui) (Agent-User Interaction Protocol) is the **agent↔user/frontend** layer of the interop stack. The standard framing — **MCP = agent↔tools, A2A = agent↔agent, AG-UI = agent↔user** — is stated by AG-UI's own docs, which note the three are complementary and often used together by one agent. It's the only interop layer that reaches an **end-user UI** (a streaming React/[CopilotKit](https://copilotkit.ai) chat surface) without us building a frontend.
 
@@ -2914,15 +2914,15 @@ Tracking: epic `[interop] AGNTCY support` with subtasks for OASF export, OASF im
 | `TEXT_MESSAGE_START/CONTENT/END` | text token deltas |
 | `TOOL_CALL_START/ARGS/END/RESULT` (streamed partial-JSON args) | tool-call events |
 | `STATE_SNAPSHOT` / `STATE_DELTA` (RFC-6902 JSON Patch) | shared agent↔UI state (new) |
-| `REASONING_*` / `THINKING_*` | reasoning deltas (already separated from text) |
+| `REASONING_*` (shipped #4629) | reasoning deltas (`AgentEvent.Reasoning`, already separated from text) |
 
 The whole job is: emit our stream wrapped in the `RUN_STARTED … RUN_FINISHED` envelope over a Micronaut SSE endpoint. Estimated **~1 day**, since we already own the hard part (typed streaming). Frontend/client tools come back as a `ToolMessage` appended to `messages` on the next `POST` (each turn re-posts the full updated history + state).
 
 **Build approach — hand-roll, no SDK dependency.** There is **no first-party JVM SDK**; the community Kotlin and Java SDKs in the repo are **client-side only** (they *consume* a remote agent's stream, they don't *serve* one), so neither helps us. Port the event enum as Kotlin sealed/data classes from the language-neutral protobuf source of truth (`sdks/typescript/packages/proto/src/proto/{events,types,patch}.proto`; the TS Zod `events.ts` is canonical and **docs lag the schema** — build against the schema, ~27–34 event types across lifecycle/text/tool/state/reasoning families). Do **not** adopt Atmosphere or AgentScope-Java — they import a rival agent model that fights our runtime.
 
 **Why deferred.** Nice-to-have, not must-have, and lower priority than AGNTCY (which reaches agents/directories — our likelier near-term consumer). Two caveats kept on record: (1) **governance** — unlike A2A (Linux Foundation) and MCP (Agentic AI Foundation), AG-UI is still single-vendor (CopilotKit/Tawkit), MIT-licensed (no patent grant), not donated to any foundation as of June 2026; mitigated by the spec being small enough that lock-in barely bites. (2) **A2A/AG-UI streaming overlap** is asserted-but-undefended by sources (both use SSE); our read is A2A streams coarse task updates to a *calling agent* while AG-UI streams fine-grained render events to a *browser* — different consumer and granularity, so they compose. Re-evaluate to must-have if AG-UI is donated to a foundation.
 
-Tracking: epic #4523 `[interop] AG-UI support (agent↔frontend serving)`. **Serve side shipped** — `AgUiServer.from(agent)` (package `agents_engine.agui`): `RunAgentInput` POST → SSE over the JDK `HttpServer`, `AgUiEventBridge` mapping `AgentSession` events into the `RUN_STARTED … RUN_FINISHED` envelope (lifecycle/text/tool/step families). Hand-rolled, no SDK (as planned). Follow-ups: STATE_SNAPSHOT/STATE_DELTA (needs a shared agent↔UI state model), REASONING/THINKING, and client-tool round-trips.
+Tracking: epic #4523 `[interop] AG-UI support (agent↔frontend serving)`. **Serve side shipped** — `AgUiServer.from(agent)` (package `agents_engine.agui`): `RunAgentInput` POST → SSE over the JDK `HttpServer`, `AgUiEventBridge` mapping `AgentSession` events into the `RUN_STARTED … RUN_FINISHED` envelope (lifecycle/text/tool/step families + REASONING #4629 — `AgentEvent.Reasoning` → `REASONING_START`/`_MESSAGE_START`/`_MESSAGE_CONTENT`/`_MESSAGE_END`/`_END`). Hand-rolled, no SDK (as planned). Follow-ups: STATE_SNAPSHOT/STATE_DELTA (needs a shared agent↔UI state model) and client-tool round-trips.
 
 ### 12.8 x402 — Agent Payments / Settlement Layer *(seller-side shipped experimental, #4527 — `X402PaymentGate`; buyer-side deferred — money-handling)*
 
diff --git a/src/main/kotlin/agents_engine/agui/AgUiEventBridge.kt b/src/main/kotlin/agents_engine/agui/AgUiEventBridge.kt
@@ -19,49 +19,67 @@ import java.util.UUID
  *
  * Stateful and single-run: construct one per run. Not thread-safe; [AgUiServer] drives it from one collector.
  *
- * v1 surfaces the lifecycle/text/tool/step families. STATE events (shared agent↔UI state — no runtime model
- * yet), REASONING / THINKING events ([AgentEvent.Reasoning]), and MESSAGES_SNAPSHOT are documented
- * follow-ups; the corresponding [AgentEvent]s are simply not surfaced yet.
+ * Surfaces the lifecycle/text/tool/step families plus REASONING (live model thinking, from
+ * [AgentEvent.Reasoning]). STATE events (shared agent↔UI state — no runtime model yet) and
+ * MESSAGES_SNAPSHOT remain documented follow-ups; those [AgentEvent]s are simply not surfaced yet.
+ *
+ * **REASONING ordering** (the AG-UI contract): a thinking block is `REASONING_START` →
+ * `REASONING_MESSAGE_START` → `REASONING_MESSAGE_CONTENT`* → `REASONING_MESSAGE_END` → `REASONING_END`
+ * (the older `THINKING_*` names are deprecated). Reasoning precedes the answer, so the block is opened on
+ * the first [AgentEvent.Reasoning] chunk and closed before any answer text, tool call, step finish, or run
+ * finish — the same discipline the text state machine uses.
  */
 internal class AgUiEventBridge(private val threadId: String, private val runId: String) {
     private var textMessageId: String? = null
+    private var reasoningId: String? = null
     private var anyText = false
 
     /** The opening `RUN_STARTED`. Emit once before collecting the session. */
     fun runStarted(): String = event("RUN_STARTED", "threadId" to threadId, "runId" to runId)
 
     /** Map one [AgentEvent] to zero or more AG-UI event payloads (in order). */
     fun onEvent(e: AgentEvent<*>): List<String> = when (e) {
+        // Reasoning precedes the answer; close any open thinking block before the first answer token.
         is AgentEvent.Token -> buildList {
+            addAll(closeReasoning())
             if (textMessageId == null) {
                 val id = newId().also { textMessageId = it; anyText = true }
                 add(event("TEXT_MESSAGE_START", "messageId" to id, "role" to "assistant"))
             }
             add(event("TEXT_MESSAGE_CONTENT", "messageId" to textMessageId!!, "delta" to e.text))
         }
 
+        is AgentEvent.Reasoning -> buildList {
+            if (reasoningId == null) {
+                val id = newId().also { reasoningId = it }
+                add(event("REASONING_START", "messageId" to id))
+                add(event("REASONING_MESSAGE_START", "messageId" to id, "role" to "assistant"))
+            }
+            add(event("REASONING_MESSAGE_CONTENT", "messageId" to reasoningId!!, "delta" to e.text))
+        }
+
         is AgentEvent.SkillStarted -> listOf(event("STEP_STARTED", "stepName" to e.skillName))
-        is AgentEvent.SkillCompleted -> closeText() + event("STEP_FINISHED", "stepName" to e.skillName)
+        is AgentEvent.SkillCompleted -> closeStreams() + event("STEP_FINISHED", "stepName" to e.skillName)
 
         is AgentEvent.ToolCallStarted ->
-            closeText() + event("TOOL_CALL_START", "toolCallId" to e.callId, "toolCallName" to e.toolName)
+            closeStreams() + event("TOOL_CALL_START", "toolCallId" to e.callId, "toolCallName" to e.toolName)
         is AgentEvent.ToolCallArgumentsDelta ->
             listOf(event("TOOL_CALL_ARGS", "toolCallId" to e.callId, "delta" to e.deltaJson))
         is AgentEvent.ToolCallFinished ->
             listOf(event("TOOL_CALL_END", "toolCallId" to e.callId))
 
         is AgentEvent.Completed<*> -> finish(e.output)
-        is AgentEvent.Failed -> closeText() + runError(e.cause.message ?: e.cause.toString())
+        is AgentEvent.Failed -> closeStreams() + runError(e.cause.message ?: e.cause.toString())
 
-        // ModelTurnStarted/Completed, Reasoning, StageStarted/Completed — not surfaced in v1.
+        // ModelTurnStarted/Completed, StageStarted/Completed — not surfaced in v1.
         else -> emptyList()
     }
 
     /** A standalone `RUN_ERROR` (used by [AgUiServer] as a backstop if collection throws unexpectedly). */
     fun runError(message: String): String = event("RUN_ERROR", "message" to message)
 
     private fun finish(output: Any?): List<String> = buildList {
-        addAll(closeText())
+        addAll(closeStreams())
         // No tokens streamed (e.g. a deterministic skill) — surface the final output as one message so a UI
         // always has something to render.
         if (!anyText && output != null) {
@@ -79,6 +97,19 @@ internal class AgUiEventBridge(private val threadId: String, private val runId:
         return listOf(event("TEXT_MESSAGE_END", "messageId" to id))
     }
 
+    private fun closeReasoning(): List<String> {
+        val id = reasoningId ?: return emptyList()
+        reasoningId = null
+        return listOf(
+            event("REASONING_MESSAGE_END", "messageId" to id),
+            event("REASONING_END", "messageId" to id),
+        )
+    }
+
+    /** Close whichever streaming block is open before a boundary (tool call, step finish, run finish). At
+     * most one of reasoning/text is open at a time, but closing both is order-safe and idempotent. */
+    private fun closeStreams(): List<String> = closeReasoning() + closeText()
+
     private operator fun List<String>.plus(one: String): List<String> = this + listOf(one)
 
     private fun newId(): String = UUID.randomUUID().toString()
diff --git a/src/main/kotlin/agents_engine/agui/AgUiServer.kt b/src/main/kotlin/agents_engine/agui/AgUiServer.kt
@@ -30,8 +30,8 @@ import kotlinx.coroutines.runBlocking
  * `NlWebServer`: binds `127.0.0.1`, optional bearer auth, front with a TLS gateway for any network reach.
  * Hand-rolled over the JDK [HttpServer] — no AG-UI SDK (the community JVM SDKs are client-side only).
  *
- * v1 surfaces the lifecycle/text/tool/step event families (see [AgUiEventBridge]); STATE and REASONING events
- * and client-tool round-trips (the next `POST` re-sends history) are follow-ups.
+ * Surfaces the lifecycle/text/tool/step event families plus REASONING (live model thinking — see
+ * [AgUiEventBridge]); STATE events and client-tool round-trips (the next `POST` re-sends history) are follow-ups.
  */
 class AgUiServer private constructor(
     private val agent: Agent<*, *>,
diff --git a/src/main/resources/internals-agent/agui/AgUiServer.md b/src/main/resources/internals-agent/agui/AgUiServer.md
diff --git a/src/test/kotlin/agents_engine/agui/AgUiServerTest.kt b/src/test/kotlin/agents_engine/agui/AgUiServerTest.kt