Skip to content

Commit a45dc25

Browse files
authored
Merge pull request #233 from Deep-CodeAI/feat/4629-agui-reasoning
feat(#4629): AG-UI streams REASONING events (live model thinking)
2 parents 841c81f + 0e54d7e commit a45dc25

7 files changed

Lines changed: 103 additions & 18 deletions

File tree

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,18 @@ All notable changes to Agents.KT are documented here. The format follows [Keep a
44

55
## [Unreleased]
66

7+
### Added — AG-UI now streams REASONING events (live model thinking) (#4629, epic #4523)
8+
9+
`AgUiServer` previously surfaced the lifecycle/text/tool/step event families; it now also bridges
10+
`AgentEvent.Reasoning` (the model's thinking stream, #2406 — Claude/DeepSeek/Ollama) into AG-UI's
11+
**REASONING** family: `REASONING_START``REASONING_MESSAGE_START``REASONING_MESSAGE_CONTENT`*
12+
`REASONING_MESSAGE_END``REASONING_END`, all keyed by one `messageId` (the deprecated `THINKING_*`
13+
names are not used). A frontend (e.g. a CopilotKit chat) can now render live reasoning instead of a
14+
spinner. Reasoning precedes the answer, so `AgUiEventBridge` opens the block on the first reasoning
15+
chunk and closes it before any answer token, tool call, step finish, or run finish — the same ordering
16+
discipline the text state machine already enforces. STATE events and client-tool round-trips remain the
17+
documented AG-UI follow-ups. 2 new tests.
18+
719
### Added — audit-ledger now records cross-cutting agent misbehaviour (#2905, epic #2882)
820

921
`agent.events.ledger(file)` previously chained only tool-action verdicts (`APPROVED` / `DENIED` /

README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -209,7 +209,7 @@ These APIs work in `main`, are unit-tested, and are exercised by integration tes
209209
- **Web-grounded search tool (`perplexitySearch`)** — `tools { +perplexitySearchTool(perplexityKey) }` lets an agent reasoning on its *own* model (Claude/OpenAI/Ollama/…) fetch live, cited facts from Perplexity's Sonar API. The tool is `untrustedOutput = true`, so results are auto-wrapped in the `{"trusted":false}` envelope and the model is warned to treat them as data, not instructions (#642) — web search is the canonical prompt-injection vector. The result renders the answer plus a numbered source list parsed from `search_results[]` (citations land in both the model context and the JSONL audit row). Controls via `perplexitySearchOptions { mode = SearchMode.ACADEMIC; recency = SearchRecency.WEEK; allowDomains("arxiv.org"); contextSize = SearchContextSize.HIGH; structuredOutput(MyType::class) }` map to `search_mode` / `search_recency_filter` / `search_domain_filter` / `web_search_options` / `response_format` json_schema (#3674). Key from `.secrets/perplexity-key`. See [docs/providers.md](docs/providers.md#web-grounded-search-tool-perplexitysearch-3676--3677).
210210
- **NLWeb endpoint tool (`nlwebSearch`)**`tools { +nlwebSearchTool(baseUrl = "https://example.com") }` lets an agent query an [NLWeb](https://github.com/nlweb-ai/NLWeb) endpoint — a website's natural-language interface over its **schema.org**-structured content — and fold the ranked, typed results into context (#4541, PRD §12.9). Like `perplexitySearch` it is `untrustedOutput = true` (fetched web content is treated as data, not instructions). `nlwebSearchOptions`-style args via `NlWebSearchOptions(site = "podcasts", mode = NlWebMode.GENERATE)`. NLWeb endpoints need no API key. (Every NLWeb endpoint is also an MCP server, so an NLWeb `/mcp` URL is equally consumable through the existing MCP client — this tool is the zero-wiring `/ask`-over-HTTP path.)
211211
- **Serve an NLWeb endpoint (`NlWebServer`)**`NlWebServer.from(agent).start()` exposes the NLWeb `POST /ask` contract (`{query, site?, mode}` → ranked schema.org `results[]`), so agents.kt is consumable by NLWeb clients — the **serve** side to `nlwebSearch`'s **consume** side (#4542). Same `from(agent)` shape, loopback-only JDK-`HttpServer` posture, and threat model as `McpServer.from(agent)` / `A2AServer.from(agent)` (`127.0.0.1`, optional bearer, front with a gateway). The query is the agent's input; an `NlWebSearchResult` output is served verbatim (ranked schema.org results), any other output becomes the `summary` answer — back the agent's retrieval with the RAG `EmbeddingStore` seam (`:agents-kt-rag`) or whatever you like.
212-
- **Serve a frontend over AG-UI (`AgUiServer`)**`AgUiServer.from(agent).start()` exposes an agent over the [AG-UI](https://github.com/ag-ui-protocol/ag-ui) protocol — the **agent↔user/frontend** layer (e.g. a CopilotKit React chat), the only interop surface that reaches an end-user UI without us building a frontend (#4523). Not a descriptor exporter — a runtime streaming surface: `POST` a `RunAgentInput` and get an **SSE stream of typed AG-UI events**, bridged live from the agent's `AgentSession` (`Token``TEXT_MESSAGE_*`, `ToolCall*``TOOL_CALL_*`, `Skill*``STEP_*`, wrapped in `RUN_STARTED … RUN_FINISHED`). Same `from(agent)` shape, loopback-only posture, and threat model as the others; hand-rolled SSE, no AG-UI SDK. **agents.kt now serves the agentic web four ways: MCP, A2A, NLWeb, and AG-UI.**
212+
- **Serve a frontend over AG-UI (`AgUiServer`)**`AgUiServer.from(agent).start()` exposes an agent over the [AG-UI](https://github.com/ag-ui-protocol/ag-ui) protocol — the **agent↔user/frontend** layer (e.g. a CopilotKit React chat), the only interop surface that reaches an end-user UI without us building a frontend (#4523). Not a descriptor exporter — a runtime streaming surface: `POST` a `RunAgentInput` and get an **SSE stream of typed AG-UI events**, bridged live from the agent's `AgentSession` (`Token``TEXT_MESSAGE_*`, `Reasoning``REASONING_*` (live model thinking), `ToolCall*``TOOL_CALL_*`, `Skill*``STEP_*`, wrapped in `RUN_STARTED … RUN_FINISHED`). Same `from(agent)` shape, loopback-only posture, and threat model as the others; hand-rolled SSE, no AG-UI SDK. **agents.kt now serves the agentic web four ways: MCP, A2A, NLWeb, and AG-UI.**
213213
- **Charge for an agent endpoint over x402 (`X402PaymentGate`, experimental)** — `X402PaymentGate(requirements, facilitator).gate(handler)` wraps any JDK `HttpHandler` so a resource is served only after a settled stablecoin (USDC) payment over the [x402](https://github.com/x402-foundation/x402) protocol (`402 Payment Required`). Pass it straight to the serve surfaces — `NlWebServer.from(agent, payment = gate)` / `AgUiServer.from(agent, payment = gate)` / `A2AServer.from(agent, payment = gate)` — to let an agent **monetize itself** (#4527/#4557; A2A's agent-card discovery stays free). The **safe, seller-side** half: **we hold no key and take no custody** — the buyer signs an EIP-3009 authorization and a *hosted* `FacilitatorClient` verifies + settles on-chain; we only configure a public `payTo`, and the LLM never touches money (gating is at the HTTP layer). **Fails closed** (any failure → `402`, never served unpaid). Buyer-side autonomous payment is deliberately *not* included (it concentrates the irreversible-money risk). New `agents_engine.x402` package, no deps.
214214
- **AGNTCY interop (OASF record + DIR directory + Identity badge)** — `agent.toOasfRecord(version, authors, locators)` exports an [AGNTCY](https://github.com/agntcy) [OASF](https://github.com/agntcy/oasf) 1.0.0 discovery record (the third exporter beside `agent.json` and the A2A AgentCard; skills carry taxonomy uids via the opt-in `.oasf("agent_orchestration/multi_agent_planning")` annotation against a vendored, drift-checked taxonomy), and `fromOasfRecord(json)` imports + fail-closed-validates it back (#4518/#4519, PRD §12.6). The `:agents-kt-dir` module publishes/discovers records in the AGNTCY **DIR** content-addressed directory over generated grpc-kotlin stubs for three services — `StoreService` (CRUD: `dir.push(agent.toOasfRecord(...))` → CID, `dir.pull(cid)`), `SearchService` (local content search by typed `DirQuery` facet — skill/domain/author/…), and `RoutingService` (`publish`/`routeSearch` for cross-peer network discovery) (#4520). The trust side ships in `:agents-kt-identity`: `IdentityVerifier.verify(compactJws, jwks)` validates an AGNTCY Identity **badge** (a JOSE/JWS-secured W3C Verifiable Credential) against an issuer's `/.well-known/jwks.json`, fail-closed via `nimbus-jose-jwt` (rejects `alg: none`, `HS*` algorithm-confusion, expiry, tamper, wrong/unknown key — #4521). Verify-only; issuance deferred. DIR Routing/Search + OCI referrers are follow-ups under epic #4517.
215215
- **Prompt caching across providers**`agent { caching { enabled = true; cacheSystemPrompt = true; cacheToolDefs = true; cacheConversation = Rolling; ttl = 1.hours; cacheable("doc-id") { ... } } }`. Vendor-neutral DSL drives Anthropic's explicit `cache_control` breakpoints (#2658), OpenAI / DeepSeek automatic prefix caching with a stable `prompt_cache_key` routing hint (#2659 / #2661), Ollama / vLLM / SGLang engine-level KV-cache reuse (no-op hints, #2662), and surfaces cache reads + writes + hit-rate on `TokenUsage` (#2663). A prefix-stability guard (#2657) detects silent cache-busters — timestamps, UUIDs, non-deterministic ordering inside cacheable segments — and warns before you pay for a single non-cached run. Off by default; non-breaking. See [docs/caching.md](docs/caching.md).

docs/prd.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2900,7 +2900,7 @@ val hits = dir.search(skill = "agent_orchestration/task_decomposition")
29002900

29012901
Tracking: epic `[interop] AGNTCY support` with subtasks for OASF export, OASF import/validate, DIR client, and Identity verify.
29022902

2903-
### 12.7 AG-UI — Agent↔Frontend Serving *(serve shipped, #4523`AgUiServer.from(agent)`; STATE/REASONING families + client-tool round-trips are follow-ups)*
2903+
### 12.7 AG-UI — Agent↔Frontend Serving *(serve shipped, #4523`AgUiServer.from(agent)`, incl. REASONING #4629; STATE families + client-tool round-trips are follow-ups)*
29042904

29052905
[AG-UI](https://github.com/ag-ui-protocol/ag-ui) (Agent-User Interaction Protocol) is the **agent↔user/frontend** layer of the interop stack. The standard framing — **MCP = agent↔tools, A2A = agent↔agent, AG-UI = agent↔user** — is stated by AG-UI's own docs, which note the three are complementary and often used together by one agent. It's the only interop layer that reaches an **end-user UI** (a streaming React/[CopilotKit](https://copilotkit.ai) chat surface) without us building a frontend.
29062906

@@ -2914,15 +2914,15 @@ Tracking: epic `[interop] AGNTCY support` with subtasks for OASF export, OASF im
29142914
| `TEXT_MESSAGE_START/CONTENT/END` | text token deltas |
29152915
| `TOOL_CALL_START/ARGS/END/RESULT` (streamed partial-JSON args) | tool-call events |
29162916
| `STATE_SNAPSHOT` / `STATE_DELTA` (RFC-6902 JSON Patch) | shared agent↔UI state (new) |
2917-
| `REASONING_*` / `THINKING_*` | reasoning deltas (already separated from text) |
2917+
| `REASONING_*` (shipped #4629) | reasoning deltas (`AgentEvent.Reasoning`, already separated from text) |
29182918

29192919
The whole job is: emit our stream wrapped in the `RUN_STARTED … RUN_FINISHED` envelope over a Micronaut SSE endpoint. Estimated **~1 day**, since we already own the hard part (typed streaming). Frontend/client tools come back as a `ToolMessage` appended to `messages` on the next `POST` (each turn re-posts the full updated history + state).
29202920

29212921
**Build approach — hand-roll, no SDK dependency.** There is **no first-party JVM SDK**; the community Kotlin and Java SDKs in the repo are **client-side only** (they *consume* a remote agent's stream, they don't *serve* one), so neither helps us. Port the event enum as Kotlin sealed/data classes from the language-neutral protobuf source of truth (`sdks/typescript/packages/proto/src/proto/{events,types,patch}.proto`; the TS Zod `events.ts` is canonical and **docs lag the schema** — build against the schema, ~27–34 event types across lifecycle/text/tool/state/reasoning families). Do **not** adopt Atmosphere or AgentScope-Java — they import a rival agent model that fights our runtime.
29222922

29232923
**Why deferred.** Nice-to-have, not must-have, and lower priority than AGNTCY (which reaches agents/directories — our likelier near-term consumer). Two caveats kept on record: (1) **governance** — unlike A2A (Linux Foundation) and MCP (Agentic AI Foundation), AG-UI is still single-vendor (CopilotKit/Tawkit), MIT-licensed (no patent grant), not donated to any foundation as of June 2026; mitigated by the spec being small enough that lock-in barely bites. (2) **A2A/AG-UI streaming overlap** is asserted-but-undefended by sources (both use SSE); our read is A2A streams coarse task updates to a *calling agent* while AG-UI streams fine-grained render events to a *browser* — different consumer and granularity, so they compose. Re-evaluate to must-have if AG-UI is donated to a foundation.
29242924

2925-
Tracking: epic #4523 `[interop] AG-UI support (agent↔frontend serving)`. **Serve side shipped**`AgUiServer.from(agent)` (package `agents_engine.agui`): `RunAgentInput` POST → SSE over the JDK `HttpServer`, `AgUiEventBridge` mapping `AgentSession` events into the `RUN_STARTED … RUN_FINISHED` envelope (lifecycle/text/tool/step families). Hand-rolled, no SDK (as planned). Follow-ups: STATE_SNAPSHOT/STATE_DELTA (needs a shared agent↔UI state model), REASONING/THINKING, and client-tool round-trips.
2925+
Tracking: epic #4523 `[interop] AG-UI support (agent↔frontend serving)`. **Serve side shipped**`AgUiServer.from(agent)` (package `agents_engine.agui`): `RunAgentInput` POST → SSE over the JDK `HttpServer`, `AgUiEventBridge` mapping `AgentSession` events into the `RUN_STARTED … RUN_FINISHED` envelope (lifecycle/text/tool/step families + REASONING #4629`AgentEvent.Reasoning``REASONING_START`/`_MESSAGE_START`/`_MESSAGE_CONTENT`/`_MESSAGE_END`/`_END`). Hand-rolled, no SDK (as planned). Follow-ups: STATE_SNAPSHOT/STATE_DELTA (needs a shared agent↔UI state model) and client-tool round-trips.
29262926

29272927
### 12.8 x402 — Agent Payments / Settlement Layer *(seller-side shipped experimental, #4527`X402PaymentGate`; buyer-side deferred — money-handling)*
29282928

src/main/kotlin/agents_engine/agui/AgUiEventBridge.kt

Lines changed: 39 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -19,49 +19,67 @@ import java.util.UUID
1919
*
2020
* Stateful and single-run: construct one per run. Not thread-safe; [AgUiServer] drives it from one collector.
2121
*
22-
* v1 surfaces the lifecycle/text/tool/step families. STATE events (shared agent↔UI state — no runtime model
23-
* yet), REASONING / THINKING events ([AgentEvent.Reasoning]), and MESSAGES_SNAPSHOT are documented
24-
* follow-ups; the corresponding [AgentEvent]s are simply not surfaced yet.
22+
* Surfaces the lifecycle/text/tool/step families plus REASONING (live model thinking, from
23+
* [AgentEvent.Reasoning]). STATE events (shared agent↔UI state — no runtime model yet) and
24+
* MESSAGES_SNAPSHOT remain documented follow-ups; those [AgentEvent]s are simply not surfaced yet.
25+
*
26+
* **REASONING ordering** (the AG-UI contract): a thinking block is `REASONING_START` →
27+
* `REASONING_MESSAGE_START` → `REASONING_MESSAGE_CONTENT`* → `REASONING_MESSAGE_END` → `REASONING_END`
28+
* (the older `THINKING_*` names are deprecated). Reasoning precedes the answer, so the block is opened on
29+
* the first [AgentEvent.Reasoning] chunk and closed before any answer text, tool call, step finish, or run
30+
* finish — the same discipline the text state machine uses.
2531
*/
2632
internal class AgUiEventBridge(private val threadId: String, private val runId: String) {
2733
private var textMessageId: String? = null
34+
private var reasoningId: String? = null
2835
private var anyText = false
2936

3037
/** The opening `RUN_STARTED`. Emit once before collecting the session. */
3138
fun runStarted(): String = event("RUN_STARTED", "threadId" to threadId, "runId" to runId)
3239

3340
/** Map one [AgentEvent] to zero or more AG-UI event payloads (in order). */
3441
fun onEvent(e: AgentEvent<*>): List<String> = when (e) {
42+
// Reasoning precedes the answer; close any open thinking block before the first answer token.
3543
is AgentEvent.Token -> buildList {
44+
addAll(closeReasoning())
3645
if (textMessageId == null) {
3746
val id = newId().also { textMessageId = it; anyText = true }
3847
add(event("TEXT_MESSAGE_START", "messageId" to id, "role" to "assistant"))
3948
}
4049
add(event("TEXT_MESSAGE_CONTENT", "messageId" to textMessageId!!, "delta" to e.text))
4150
}
4251

52+
is AgentEvent.Reasoning -> buildList {
53+
if (reasoningId == null) {
54+
val id = newId().also { reasoningId = it }
55+
add(event("REASONING_START", "messageId" to id))
56+
add(event("REASONING_MESSAGE_START", "messageId" to id, "role" to "assistant"))
57+
}
58+
add(event("REASONING_MESSAGE_CONTENT", "messageId" to reasoningId!!, "delta" to e.text))
59+
}
60+
4361
is AgentEvent.SkillStarted -> listOf(event("STEP_STARTED", "stepName" to e.skillName))
44-
is AgentEvent.SkillCompleted -> closeText() + event("STEP_FINISHED", "stepName" to e.skillName)
62+
is AgentEvent.SkillCompleted -> closeStreams() + event("STEP_FINISHED", "stepName" to e.skillName)
4563

4664
is AgentEvent.ToolCallStarted ->
47-
closeText() + event("TOOL_CALL_START", "toolCallId" to e.callId, "toolCallName" to e.toolName)
65+
closeStreams() + event("TOOL_CALL_START", "toolCallId" to e.callId, "toolCallName" to e.toolName)
4866
is AgentEvent.ToolCallArgumentsDelta ->
4967
listOf(event("TOOL_CALL_ARGS", "toolCallId" to e.callId, "delta" to e.deltaJson))
5068
is AgentEvent.ToolCallFinished ->
5169
listOf(event("TOOL_CALL_END", "toolCallId" to e.callId))
5270

5371
is AgentEvent.Completed<*> -> finish(e.output)
54-
is AgentEvent.Failed -> closeText() + runError(e.cause.message ?: e.cause.toString())
72+
is AgentEvent.Failed -> closeStreams() + runError(e.cause.message ?: e.cause.toString())
5573

56-
// ModelTurnStarted/Completed, Reasoning, StageStarted/Completed — not surfaced in v1.
74+
// ModelTurnStarted/Completed, StageStarted/Completed — not surfaced in v1.
5775
else -> emptyList()
5876
}
5977

6078
/** A standalone `RUN_ERROR` (used by [AgUiServer] as a backstop if collection throws unexpectedly). */
6179
fun runError(message: String): String = event("RUN_ERROR", "message" to message)
6280

6381
private fun finish(output: Any?): List<String> = buildList {
64-
addAll(closeText())
82+
addAll(closeStreams())
6583
// No tokens streamed (e.g. a deterministic skill) — surface the final output as one message so a UI
6684
// always has something to render.
6785
if (!anyText && output != null) {
@@ -79,6 +97,19 @@ internal class AgUiEventBridge(private val threadId: String, private val runId:
7997
return listOf(event("TEXT_MESSAGE_END", "messageId" to id))
8098
}
8199

100+
private fun closeReasoning(): List<String> {
101+
val id = reasoningId ?: return emptyList()
102+
reasoningId = null
103+
return listOf(
104+
event("REASONING_MESSAGE_END", "messageId" to id),
105+
event("REASONING_END", "messageId" to id),
106+
)
107+
}
108+
109+
/** Close whichever streaming block is open before a boundary (tool call, step finish, run finish). At
110+
* most one of reasoning/text is open at a time, but closing both is order-safe and idempotent. */
111+
private fun closeStreams(): List<String> = closeReasoning() + closeText()
112+
82113
private operator fun List<String>.plus(one: String): List<String> = this + listOf(one)
83114

84115
private fun newId(): String = UUID.randomUUID().toString()

src/main/kotlin/agents_engine/agui/AgUiServer.kt

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -30,8 +30,8 @@ import kotlinx.coroutines.runBlocking
3030
* `NlWebServer`: binds `127.0.0.1`, optional bearer auth, front with a TLS gateway for any network reach.
3131
* Hand-rolled over the JDK [HttpServer] — no AG-UI SDK (the community JVM SDKs are client-side only).
3232
*
33-
* v1 surfaces the lifecycle/text/tool/step event families (see [AgUiEventBridge]); STATE and REASONING events
34-
* and client-tool round-trips (the next `POST` re-sends history) are follow-ups.
33+
* Surfaces the lifecycle/text/tool/step event families plus REASONING (live model thinking — see
34+
* [AgUiEventBridge]); STATE events and client-tool round-trips (the next `POST` re-sends history) are follow-ups.
3535
*/
3636
class AgUiServer private constructor(
3737
private val agent: Agent<*, *>,

0 commit comments

Comments
 (0)