Agentic incident investigation, driven from your browser.
Triagent is a localhost web app that pairs the Claude reasoning agent with read-only
Kubernetes access, an extensible MCP catalog (Prometheus, Slack, GitHub,
incident.io, your own), a guided playbook walker, and a persistent team wiki. You run triagent start, hand it the
symptom, and it drives a focused diagnosis you can paste into a ticket when it's done.
Every tool call stays visible, so you can audit the chain or interrupt at any point. Finished sessions can be shared so the next operator starts from where you ended, not from the alert.
📚 Read the full documentation →
Kubernetes triage isn't a kubectl command. It's a multi-tab scramble across half a dozen surfaces. Triagent collapses
that scramble into one conversation against one audit trail:
- The agent reads the procedure, doesn't memorise it. Domain knowledge lives in playbooks loaded at runtime, not in a system prompt or a fine-tune. Updating what the system can diagnose is a YAML edit.
- The tools are a typed catalog, not a shell. Every action the agent can take is a curated MCP tool with a schema'd input. The agent can't go off-piste, and the catalog doubles as documentation.
- Knowledge accumulates as data. Each investigation can deposit a playbook (procedural) or a wiki entry (factual). Tomorrow's recall is a single tool call instead of a Slack archaeology dig.
With watches configured on your sources (Slack channels, GitHub issue queries, more on the way), the launcher pre-classifies new items and proposes investigations on its own. With auto mode on, routine ones run end-to-end before you've read the page. You can take over at any moment.
Four surfaces, each documented in depth on the docs site:
- Investigations: the live triage view. Hand the agent a symptom and a context (cluster, Slack thread, incident.io link, notes), watch the walker drive the diagnosis, ship the markdown summary.
- Playbooks: the YAML-defined guided walker the agent follows. Author them in-browser with an AI co-editor.
- Wiki: the team's persistent knowledge base of failure patterns and prior art, queryable by the agent.
- Watches: polling rules that turn Slack messages, GitHub issues, or alerts into proposed investigations.
- Typed tool catalog, not a shell. Every action the agent can take is a schema'd MCP call. The same surface the agent reads is the surface you author against.
- Playbooks as data. YAML graphs the walker follows, authored in-browser with an AI co-editor and shipped as PRs to the playbooks repo.
- Wiki that compounds. Every finished investigation can deposit an entry; tomorrow's recall is a single tool call instead of a Slack archaeology dig.
- Watches close the loop. Slack channels and GitHub queries become pre-classified signals. Routine ones auto-spawn an investigation before the pager fires.
- claude CLI on $PATH, authenticated. See Claude Code.
- A working kubeconfig with read access to the namespace you want to triage. Triagent talks to the cluster via client-go. kubectl is not required but most operators have it.
- tsh if you use Teleport-backed cluster discovery (optional).
- Kubernetes permissions to read pods/logs in the target namespace. Triagent does not create RBAC. It refuses to start if your existing permissions are insufficient.
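The prerequisites above can be checked by hand before installing. The following is a minimal sketch, not part of triagent: the helper names need_cmd and can_read_pods are hypothetical, and the RBAC check assumes you have kubectl installed, which triagent itself does not require.

```shell
# Hypothetical pre-install helpers; triagent does not ship these.

# need_cmd NAME: succeed if NAME resolves on $PATH, otherwise print a hint and fail.
need_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing: $1 is not on \$PATH" >&2
  return 1
}

# can_read_pods NAMESPACE: ask the API server whether the current identity may
# get pods and read their logs in NAMESPACE. Assumes kubectl is available.
can_read_pods() {
  kubectl auth can-i get pods --namespace "$1" --quiet &&
    kubectl auth can-i get pods --subresource=log --namespace "$1" --quiet
}
```

For example, `need_cmd claude && can_read_pods my-namespace` (with my-namespace as a placeholder) exits non-zero if either the CLI or the read permission is missing.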
macOS / Linux:
curl -fsSL https://sourcehawk.github.io/triagent/install.sh | sh
Windows (PowerShell):
irm https://sourcehawk.github.io/triagent/install.ps1 | iex
Homebrew (macOS):
brew install --cask sourcehawk/tap/triagent
Manual download: grab the archive for your OS/arch from the latest release and put triagent and triagent-mcp somewhere on your $PATH.
The install script downloads both triagent (the launcher) and triagent-mcp
(the MCP multiplexer) to ~/.local/bin (or %LOCALAPPDATA%\Programs\triagent
on Windows). The launcher locates triagent-mcp adjacent to itself or anywhere
on $PATH. The Next.js frontend is embedded in the launcher, so each binary
ships as a single self-contained executable.
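To see which triagent-mcp a given launcher would pick up, the lookup described above can be approximated in shell. This is a sketch under assumptions: find_mcp is a hypothetical helper, and checking the adjacent copy before $PATH is a guess at the order, which the text above does not specify.

```shell
# find_mcp LAUNCHER_PATH: print the path of triagent-mcp, trying the copy next
# to the launcher first and falling back to a $PATH lookup.
# The preference order here is an assumption, not confirmed launcher behaviour.
find_mcp() {
  launcher_dir=$(dirname -- "$1")
  if [ -x "$launcher_dir/triagent-mcp" ]; then
    printf '%s\n' "$launcher_dir/triagent-mcp"
    return 0
  fi
  # command -v prints the resolved path and fails if nothing is found.
  command -v triagent-mcp
}
```

Calling `find_mcp "$(command -v triagent)"` prints the candidate path, or fails if neither location has the binary.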
Build from source (requires Node 20+ and Go; see .tool-versions):
make build

triagent start
This boots a localhost HTTP server, prints its URL with a per-launch token, and opens your browser to it. Press
Ctrl-C to stop. It works out of the box on the embedded default profile; see
Customising the profile below to teach the agent your stack and wire upstream repos.
In the browser:
- Pick a cluster: directly from kubeconfig, or via Teleport.
- Log in if prompted (SSO/2FA prompts go to the launcher terminal).
- Enter the namespace and optional notes, Slack channel, or incident URL.
- Preflight runs: namespace exists, you can list pods. If anything's missing, the launcher tells you why and stops.
- Investigate: the agent walks the playbook, calls tools, and writes a summary you can copy or push upstream as a PR (once you've wired an upstream repo; see below).
triagent start # boot the launcher
triagent start --profile my-profile # use a custom embedded profile by name
triagent start --profile ./my-prof # use an on-disk profile (dir or yaml path)
triagent create-profile my-team # fork the embedded default into ./my-team/ for editing
triagent clean # reset launcher caches (sessions, clones, etc.)
triagent clean --dry-run # show what would be deleted
--profile accepts either an embedded profile name or a filesystem path; TRIAGENT_PROFILE is the env-var
equivalent.
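Selecting the profile through the environment is handy in wrapper scripts. A small sketch follows: start_with_profile is a hypothetical helper, and how the launcher resolves a conflict when both --profile and TRIAGENT_PROFILE are set is not covered here.

```shell
# start_with_profile PROFILE [ARGS...]: run the launcher with PROFILE selected
# through TRIAGENT_PROFILE instead of the --profile flag.
# PROFILE may be an embedded profile name or a filesystem path.
start_with_profile() {
  profile=$1
  shift
  TRIAGENT_PROFILE=$profile triagent start "$@"
}
```

With this defined, `start_with_profile ./my-prof` is intended to behave like `triagent start --profile ./my-prof`.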
A profile is the deployment-specific config that fits triagent to your platform: which playbooks the agent walks,
which MCPs attach, what the preflight form asks for, and what the agent already knows about your stack before it
starts. The embedded default runs as-is but is platform-neutral. Customising the profile is the
highest-leverage step in a triagent setup. Two overrides matter most:
- architecture.md: the prompt the agent reads before every triage. Teach it your platform's CRDs, namespace conventions, dependency direction, and recurring failure modes. Every investigation starts informed instead of rediscovering your stack.
- Upstream repos (defaults.playbooks_repo, defaults.wiki_repo, defaults.sessions_repo): the GitHub repos backing the playbook set, team wiki, and committed session transcripts. Wiring these enables sync-from-upstream and push-as-PR; without them, edits stay local-only. Each repo is independent; wire any subset.
The recommended setup is a tiny overlay that inherits from default and only spells out what you're overriding:
mkdir -p ~/.config/triagent/profile
cat > ~/.config/triagent/profile/profile.yaml <<'YAML'
name: my-team
base: default
defaults:
  playbooks_repo: my-org/triagent-playbooks # GitHub OWNER/REPO
  wiki_repo: my-org/triagent-wiki
  sessions_repo: my-org/triagent-sessions
prompt_files:
  architecture.md: architecture.md
YAML
$EDITOR ~/.config/triagent/profile/architecture.md # describe your platform
triagent start --profile ~/.config/triagent/profile
Everything you leave out (paths, other prompts, investigation inputs, kinds.json, extra MCPs, Prometheus, model
selection, auth) is inherited from default. See
Profiles for the full schema, alternative layouts (full fork
via triagent create-profile, air-gapped mode), and the longer narrative on each block.
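Before pointing the launcher at an overlay directory, it can help to confirm the two files from the example above are in place. This is a hypothetical convenience check, not something triagent provides; it assumes the overlay layout shown above (profile.yaml plus an architecture.md next to it) and does not validate the YAML itself.

```shell
# check_profile_dir DIR: succeed only if DIR contains non-empty profile.yaml
# and architecture.md files, reporting each problem on stderr.
check_profile_dir() {
  dir=$1
  rc=0
  for f in profile.yaml architecture.md; do
    if [ ! -s "$dir/$f" ]; then
      echo "profile check: $dir/$f is missing or empty" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

A typical use would be `check_profile_dir ~/.config/triagent/profile && triagent start --profile ~/.config/triagent/profile`.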
PRs welcome. See DEVELOPER_GUIDE.md for the full contributor setup, CLAUDE.md for the durable conventions, and open issues for ideas worth picking up.
Quick loop:
make test # Go race tests + frontend vitest (wholesale)
make lint # Go lint
make build # frontend bundle + both binaries
# UI dev loop (no Go rebuild for frontend changes):
go run . start # terminal 1
cd frontend && npm run dev # terminal 2, proxies /api/* to :8080



