README.md: 1 addition, 1 deletion
@@ -209,7 +209,7 @@ These APIs work in `main`, are unit-tested, and are exercised by integration tes
- **Web-grounded search tool (`perplexitySearch`)** — `tools { +perplexitySearchTool(perplexityKey) }` lets an agent reasoning on its *own* model (Claude/OpenAI/Ollama/…) fetch live, cited facts from Perplexity's Sonar API. The tool is `untrustedOutput = true`, so results are auto-wrapped in the `{"trusted":false}` envelope and the model is warned to treat them as data, not instructions (#642) — web search is the canonical prompt-injection vector. The result renders the answer plus a numbered source list parsed from `search_results[]` (citations land in both the model context and the JSONL audit row). Controls via `perplexitySearchOptions { mode = SearchMode.ACADEMIC; recency = SearchRecency.WEEK; allowDomains("arxiv.org"); contextSize = SearchContextSize.HIGH; structuredOutput(MyType::class) }` map to `search_mode` / `search_recency_filter` / `search_domain_filter` / `web_search_options` / `response_format` json_schema (#3674). Key from `.secrets/perplexity-key`. See [docs/providers.md](docs/providers.md#web-grounded-search-tool-perplexitysearch-3676--3677).
- **NLWeb endpoint tool (`nlwebSearch`)** — `tools { +nlwebSearchTool(baseUrl = "https://example.com") }` lets an agent query an [NLWeb](https://github.com/nlweb-ai/NLWeb) endpoint — a website's natural-language interface over its **schema.org**-structured content — and fold the ranked, typed results into context (#4541, PRD §12.9). Like `perplexitySearch` it is `untrustedOutput = true` (fetched web content is treated as data, not instructions). `nlwebSearchOptions`-style args via `NlWebSearchOptions(site = "podcasts", mode = NlWebMode.GENERATE)`. NLWeb endpoints need no API key. (Every NLWeb endpoint is also an MCP server, so an NLWeb `/mcp` URL is equally consumable through the existing MCP client — this tool is the zero-wiring `/ask`-over-HTTP path.)
- **Serve an NLWeb endpoint (`NlWebServer`)** — `NlWebServer.from(agent).start()` exposes the NLWeb `POST /ask` contract (`{query, site?, mode}` → ranked schema.org `results[]`), so agents.kt is consumable by NLWeb clients — the **serve** side to `nlwebSearch`'s **consume** side (#4542). Same `from(agent)` shape, loopback-only JDK-`HttpServer` posture, and threat model as `McpServer.from(agent)` / `A2AServer.from(agent)` (`127.0.0.1`, optional bearer, front with a gateway). The query is the agent's input; an `NlWebSearchResult` output is served verbatim (ranked schema.org results), any other output becomes the `summary` answer — back the agent's retrieval with the RAG `EmbeddingStore` seam (`:agents-kt-rag`) or whatever you like.
-
- **Serve a frontend over AG-UI (`AgUiServer`)** — `AgUiServer.from(agent).start()` exposes an agent over the [AG-UI](https://github.com/ag-ui-protocol/ag-ui) protocol — the **agent↔user/frontend** layer (e.g. a CopilotKit React chat), the only interop surface that reaches an end-user UI without us building a frontend (#4523). Not a descriptor exporter — a runtime streaming surface: `POST` a `RunAgentInput` and get an **SSE stream of typed AG-UI events**, bridged live from the agent's `AgentSession` (`Token` → `TEXT_MESSAGE_*`, `ToolCall*` → `TOOL_CALL_*`, `Skill*` → `STEP_*`, wrapped in `RUN_STARTED … RUN_FINISHED`). Same `from(agent)` shape, loopback-only posture, and threat model as the others; hand-rolled SSE, no AG-UI SDK. **agents.kt now serves the agentic web four ways: MCP, A2A, NLWeb, and AG-UI.**
+
- **Serve a frontend over AG-UI (`AgUiServer`)** — `AgUiServer.from(agent).start()` exposes an agent over the [AG-UI](https://github.com/ag-ui-protocol/ag-ui) protocol — the **agent↔user/frontend** layer (e.g. a CopilotKit React chat), the only interop surface that reaches an end-user UI without us building a frontend (#4523). Not a descriptor exporter — a runtime streaming surface: `POST` a `RunAgentInput` and get an **SSE stream of typed AG-UI events**, bridged live from the agent's `AgentSession` (`Token` → `TEXT_MESSAGE_*`, `Reasoning` → `REASONING_*` (live model thinking), `ToolCall*` → `TOOL_CALL_*`, `Skill*` → `STEP_*`, wrapped in `RUN_STARTED … RUN_FINISHED`). Same `from(agent)` shape, loopback-only posture, and threat model as the others; hand-rolled SSE, no AG-UI SDK. **agents.kt now serves the agentic web four ways: MCP, A2A, NLWeb, and AG-UI.**
- **Charge for an agent endpoint over x402 (`X402PaymentGate`, experimental)** — `X402PaymentGate(requirements, facilitator).gate(handler)` wraps any JDK `HttpHandler` so a resource is served only after a settled stablecoin (USDC) payment over the [x402](https://github.com/x402-foundation/x402) protocol (`402 Payment Required`). Pass it straight to the serve surfaces — `NlWebServer.from(agent, payment = gate)` / `AgUiServer.from(agent, payment = gate)` / `A2AServer.from(agent, payment = gate)` — to let an agent **monetize itself** (#4527/#4557; A2A's agent-card discovery stays free). The **safe, seller-side** half: **we hold no key and take no custody** — the buyer signs an EIP-3009 authorization and a *hosted* `FacilitatorClient` verifies + settles on-chain; we only configure a public `payTo`, and the LLM never touches money (gating is at the HTTP layer). **Fails closed** (any failure → `402`, never served unpaid). Buyer-side autonomous payment is deliberately *not* included (it concentrates the irreversible-money risk). New `agents_engine.x402` package, no deps.
- **AGNTCY interop (OASF record + DIR directory + Identity badge)** — `agent.toOasfRecord(version, authors, locators)` exports an [AGNTCY](https://github.com/agntcy) [OASF](https://github.com/agntcy/oasf) 1.0.0 discovery record (the third exporter beside `agent.json` and the A2A AgentCard; skills carry taxonomy uids via the opt-in `.oasf("agent_orchestration/multi_agent_planning")` annotation against a vendored, drift-checked taxonomy), and `fromOasfRecord(json)` imports + fail-closed-validates it back (#4518/#4519, PRD §12.6). The `:agents-kt-dir` module publishes/discovers records in the AGNTCY **DIR** content-addressed directory over generated grpc-kotlin stubs for three services — `StoreService` (CRUD: `dir.push(agent.toOasfRecord(...))` → CID, `dir.pull(cid)`), `SearchService` (local content search by typed `DirQuery` facet — skill/domain/author/…), and `RoutingService` (`publish`/`routeSearch` for cross-peer network discovery) (#4520). The trust side ships in `:agents-kt-identity`: `IdentityVerifier.verify(compactJws, jwks)` validates an AGNTCY Identity **badge** (a JOSE/JWS-secured W3C Verifiable Credential) against an issuer's `/.well-known/jwks.json`, fail-closed via `nimbus-jose-jwt` (rejects `alg: none`, `HS*` algorithm-confusion, expiry, tamper, wrong/unknown key — #4521). Verify-only; issuance deferred. DIR Routing/Search + OCI referrers are follow-ups under epic #4517.
- **Prompt caching across providers** — `agent { caching { enabled = true; cacheSystemPrompt = true; cacheToolDefs = true; cacheConversation = Rolling; ttl = 1.hours; cacheable("doc-id") { ... } } }`. Vendor-neutral DSL drives Anthropic's explicit `cache_control` breakpoints (#2658), OpenAI / DeepSeek automatic prefix caching with a stable `prompt_cache_key` routing hint (#2659 / #2661), Ollama / vLLM / SGLang engine-level KV-cache reuse (no-op hints, #2662), and surfaces cache reads + writes + hit-rate on `TokenUsage` (#2663). A prefix-stability guard (#2657) detects silent cache-busters — timestamps, UUIDs, non-deterministic ordering inside cacheable segments — and warns before you pay for a single non-cached run. Off by default; non-breaking. See [docs/caching.md](docs/caching.md).
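
To illustrate the idea behind the prefix-stability guard, here is a minimal sketch that scans a cacheable segment for two of the cache-busters named above (timestamps and UUIDs). It is not the agents.kt implementation: `CacheBuster`, `busterPatterns`, and `findCacheBusters` are hypothetical names, the regexes are simple approximations, and detecting non-deterministic ordering is left out.

```kotlin
// Hypothetical sketch; none of these names come from agents.kt.
data class CacheBuster(val kind: String, val match: String)

val busterPatterns: List<Pair<String, Regex>> = listOf(
    // ISO-8601-style timestamps such as 2026-06-01T12:30:00
    "timestamp" to Regex("""\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}"""),
    // Canonical 8-4-4-4-12 hexadecimal UUIDs
    "uuid" to Regex("""[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"""),
)

// Returns every substring of the segment that would change between runs
// and therefore invalidate a cached prefix.
fun findCacheBusters(segment: String): List<CacheBuster> =
    busterPatterns.flatMap { (kind, regex) ->
        regex.findAll(segment).map { CacheBuster(kind, it.value) }.toList()
    }

fun main() {
    val unstable = "Session 123e4567-e89b-12d3-a456-426614174000 started 2026-06-01T12:30:00"
    findCacheBusters(unstable).forEach {
        println("warning: ${it.kind} inside cacheable segment: ${it.match}")
    }
}
```

A guard of this kind would run before the first request is sent, so the warning arrives before an uncached run is billed.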
[AG-UI](https://github.com/ag-ui-protocol/ag-ui) (Agent-User Interaction Protocol) is the **agent↔user/frontend** layer of the interop stack. The standard framing — **MCP = agent↔tools, A2A = agent↔agent, AG-UI = agent↔user** — is stated by AG-UI's own docs, which note the three are complementary and often used together by one agent. It's the only interop layer that reaches an **end-user UI** (a streaming React/[CopilotKit](https://copilotkit.ai) chat surface) without us building a frontend.
@@ -2914,15 +2914,15 @@ Tracking: epic `[interop] AGNTCY support` with subtasks for OASF export, OASF im
| `TEXT_MESSAGE_START/CONTENT/END` | text token deltas |
-
| `REASONING_*` / `THINKING_*` | reasoning deltas (already separated from text) |
+
| `REASONING_*` (shipped #4629) | reasoning deltas (`AgentEvent.Reasoning`, already separated from text) |
The whole job is: emit our stream wrapped in the `RUN_STARTED … RUN_FINISHED` envelope over a Micronaut SSE endpoint. Estimated **~1 day**, since we already own the hard part (typed streaming). Frontend/client tools come back as a `ToolMessage` appended to `messages` on the next `POST` (each turn re-posts the full updated history + state).
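
The envelope described above can be sketched in a few lines. This is an illustration only, not the shipped bridge: `sseFrame`, `jsonString`, and `runEnvelope` are hypothetical helpers, the JSON field names (`type`, `messageId`, `delta`) are assumptions to verify against the AG-UI schema, and run or thread identifiers are left out for brevity.

```kotlin
// Frame one JSON payload as a Server-Sent Events message: "data: <json>" plus a blank line.
fun sseFrame(json: String): String = "data: $json\n\n"

// Minimal JSON string escaping, enough for this sketch (backslash, quote, newline).
fun jsonString(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\""

// Wrap a list of text token deltas in the RUN_STARTED … RUN_FINISHED envelope.
fun runEnvelope(tokens: List<String>, messageId: String = "msg-1"): List<String> = buildList {
    val id = jsonString(messageId)
    add("""{"type":"RUN_STARTED"}""")
    add("""{"type":"TEXT_MESSAGE_START","messageId":$id}""")
    for (t in tokens) {
        add("""{"type":"TEXT_MESSAGE_CONTENT","messageId":$id,"delta":${jsonString(t)}}""")
    }
    add("""{"type":"TEXT_MESSAGE_END","messageId":$id}""")
    add("""{"type":"RUN_FINISHED"}""")
}.map(::sseFrame)

fun main() {
    // Each frame would be written to the HTTP response body and flushed as it is produced.
    runEnvelope(listOf("Hel", "lo")).forEach(::print)
}
```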
**Build approach — hand-roll, no SDK dependency.** There is **no first-party JVM SDK**; the community Kotlin and Java SDKs in the repo are **client-side only** (they *consume* a remote agent's stream, they don't *serve* one), so neither helps us. Port the event enum as Kotlin sealed/data classes from the language-neutral protobuf source of truth (`sdks/typescript/packages/proto/src/proto/{events,types,patch}.proto`; the TS Zod `events.ts` is canonical and **docs lag the schema** — build against the schema, ~27–34 event types across lifecycle/text/tool/state/reasoning families). Do **not** adopt Atmosphere or AgentScope-Java — they import a rival agent model that fights our runtime.
**Why deferred.** Nice-to-have, not must-have, and lower priority than AGNTCY (which reaches agents/directories — our likelier near-term consumer). Two caveats kept on record: (1) **governance** — unlike A2A (Linux Foundation) and MCP (Agentic AI Foundation), AG-UI is still single-vendor (CopilotKit/Tawkit), MIT-licensed (no patent grant), not donated to any foundation as of June 2026; mitigated by the spec being small enough that lock-in barely bites. (2) **A2A/AG-UI streaming overlap** is asserted-but-undefended by sources (both use SSE); our read is A2A streams coarse task updates to a *calling agent* while AG-UI streams fine-grained render events to a *browser* — different consumer and granularity, so they compose. Re-evaluate to must-have if AG-UI is donated to a foundation.
-
Tracking: epic #4523 `[interop] AG-UI support (agent↔frontend serving)`. **Serve side shipped** — `AgUiServer.from(agent)` (package `agents_engine.agui`): `RunAgentInput` POST → SSE over the JDK `HttpServer`, `AgUiEventBridge` mapping `AgentSession` events into the `RUN_STARTED … RUN_FINISHED` envelope (lifecycle/text/tool/step families). Hand-rolled, no SDK (as planned). Follow-ups: STATE_SNAPSHOT/STATE_DELTA (needs a shared agent↔UI state model), REASONING/THINKING, and client-tool round-trips.
+
Tracking: epic #4523 `[interop] AG-UI support (agent↔frontend serving)`. **Serve side shipped** — `AgUiServer.from(agent)` (package `agents_engine.agui`): `RunAgentInput` POST → SSE over the JDK `HttpServer`, `AgUiEventBridge` mapping `AgentSession` events into the `RUN_STARTED … RUN_FINISHED` envelope (lifecycle/text/tool/step families + REASONING #4629 — `AgentEvent.Reasoning` → `REASONING_START`/`_MESSAGE_START`/`_MESSAGE_CONTENT`/`_MESSAGE_END`/`_END`). Hand-rolled, no SDK (as planned). Follow-ups: STATE_SNAPSHOT/STATE_DELTA (needs a shared agent↔UI state model) and client-tool round-trips.
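
One way to picture the reasoning mapping is as a small state machine that opens and closes a block whenever the stream switches between reasoning and text. The sketch below is not the real `AgUiEventBridge`: the `AgentEvent` types are hypothetical stand-ins, grouping consecutive deltas into one block is an assumption, and only event type names are emitted (no payloads).

```kotlin
// Hypothetical stand-ins for session events; the real AgentEvent hierarchy may differ.
sealed interface AgentEvent {
    data class Token(val text: String) : AgentEvent
    data class Reasoning(val text: String) : AgentEvent
}

enum class Block { TEXT, REASONING }

// Maps a sequence of session events to AG-UI event type names.
fun bridge(events: List<AgentEvent>): List<String> {
    val out = mutableListOf("RUN_STARTED")
    var open: Block? = null // the block currently open, if any

    // Close the open block (if it differs from the next one) and open the next.
    fun switchTo(next: Block?) {
        if (open == next) return
        when (open) {
            Block.TEXT -> out += "TEXT_MESSAGE_END"
            Block.REASONING -> out += listOf("REASONING_MESSAGE_END", "REASONING_END")
            null -> Unit
        }
        when (next) {
            Block.TEXT -> out += "TEXT_MESSAGE_START"
            Block.REASONING -> out += listOf("REASONING_START", "REASONING_MESSAGE_START")
            null -> Unit
        }
        open = next
    }

    for (e in events) {
        when (e) {
            is AgentEvent.Reasoning -> { switchTo(Block.REASONING); out += "REASONING_MESSAGE_CONTENT" }
            is AgentEvent.Token -> { switchTo(Block.TEXT); out += "TEXT_MESSAGE_CONTENT" }
        }
    }
    switchTo(null) // close whatever is still open before ending the run
    out += "RUN_FINISHED"
    return out
}

fun main() {
    val events = listOf(AgentEvent.Reasoning("think"), AgentEvent.Token("hi"))
    bridge(events).forEach(::println)
}
```

Keeping reasoning and text in separate blocks lets a frontend render model thinking apart from the answer, which matches the "already separated from text" note in the event table.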