Coding Agent

The coding agent is a specialist personality with focused file-editing, shell, and git tools. You can reach it from any channel — “ask the coding agent to…”, or just start talking about code and the router will pick it automatically.

  • Read, edit, and create files in a workspace
  • Run shell commands inside a sandbox
  • Use git (status, diff, commit, log, branch, PR create)
  • Search file contents and filenames
  • Follow project instructions from BUTLER.md
| Tool | What it does |
| --- | --- |
| file.read | Read a file with line numbers |
| file.write | Create a new file |
| file.edit | Replace a string (exact match) |
| file.list | List a directory, optionally by glob pattern |
| file.search | Recursive content search |
| shell.exec | Run a shell command in the workspace sandbox |
| git.status | Repository status |
| git.diff | Staged and unstaged changes |
| git.commit | Stage files and create a commit |
| git.log | Recent commits |
| git.branch | Create, list, or switch branches |
| git.pr_create | Create a pull request using the GitHub CLI |

All tools are capability-gated: whenever the agent tries to call one, the capability engine first checks whether the current channel and session hold the corresponding grant, and the call goes ahead only if they do.

shell.exec runs inside a workspace sandbox. On Linux this uses unshare namespaces; on macOS it uses sandbox-exec. Commands cannot escape the workspace unless the sandbox is in allow-list mode and the target path is explicitly allowed:

configurations:
  sandbox:
    mode: workspace-only # off | workspace-only | allow-list
    allow_paths:
      - /home/me/projects
      - /tmp
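
The three modes can be pictured as a simple path decision. The Go sketch below is illustrative only: `pathAllowed` and `within` are hypothetical helpers, not AI Butler's implementation, and the real sandbox enforces isolation at the OS level (namespaces or sandbox-exec) rather than by checking paths. It assumes that `off` permits everything and that the workspace stays reachable in `allow-list` mode.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// within reports whether target is root itself or lies beneath it.
// Both paths are cleaned first, so "a/../b" style escapes are caught.
func within(root, target string) bool {
	rel, err := filepath.Rel(filepath.Clean(root), filepath.Clean(target))
	if err != nil {
		return false
	}
	if rel == "." {
		return true
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// pathAllowed is a hypothetical helper mirroring the documented modes.
func pathAllowed(mode, workspace string, allowPaths []string, target string) bool {
	switch mode {
	case "off": // assumption: no sandboxing, so nothing is blocked
		return true
	case "workspace-only":
		return within(workspace, target)
	case "allow-list": // assumption: workspace plus the listed paths
		if within(workspace, target) {
			return true
		}
		for _, p := range allowPaths {
			if within(p, target) {
				return true
			}
		}
		return false
	}
	return false // unknown mode: fail closed
}

func main() {
	allow := []string{"/home/me/projects", "/tmp"}
	fmt.Println(pathAllowed("workspace-only", "/work/repo", allow, "/tmp/x"))
	fmt.Println(pathAllowed("allow-list", "/work/repo", allow, "/tmp/x"))
	fmt.Println(pathAllowed("allow-list", "/work/repo", allow, "/work/repo/../secret"))
}
```

This sketch compares cleaned paths lexically and does not resolve symlinks, which is one reason a production sandbox leans on kernel-level isolation instead.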

Like CLAUDE.md in Claude Code, AI Butler reads a BUTLER.md file from the current workspace root. Use it to give the agent project-specific context: test commands, style rules, architecture notes, gotchas.

BUTLER.md
This is a Go 1.26 project. Run tests with `go test -race ./...`.
Use `gofumpt` for formatting. Never use CGO — we compile with `CGO_ENABLED=0`.

See BUTLER.md configuration for the full format.

Hooks let you run shell commands before or after tool calls — useful for auto-formatting, running tests, or blocking dangerous edits:

configurations:
  hooks:
    post_tool_use:
      - command: "gofumpt -w ."
        tools: ["file.edit", "file.write"]
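
To make the matching rule concrete, here is a small Go sketch of selecting which hook commands apply after a given tool call. The `Hook` struct mirrors the `command` and `tools` keys from the config above, but the type, the `commandsFor` function, and the exact matching semantics are assumptions for illustration; the Hooks reference is the authority on how AI Butler actually resolves hooks.

```go
package main

import "fmt"

// Hook mirrors one entry under post_tool_use (hypothetical type).
type Hook struct {
	Command string
	Tools   []string
}

// commandsFor returns the commands of every hook whose Tools list
// names the given tool, preserving config order.
func commandsFor(hooks []Hook, tool string) []string {
	var out []string
	for _, h := range hooks {
		for _, t := range h.Tools {
			if t == tool {
				out = append(out, h.Command)
				break // one match per hook is enough
			}
		}
	}
	return out
}

func main() {
	hooks := []Hook{
		{Command: "gofumpt -w .", Tools: []string{"file.edit", "file.write"}},
	}
	fmt.Println(commandsFor(hooks, "file.edit")) // the formatter hook applies
	fmt.Println(commandsFor(hooks, "file.read")) // no hook applies
}
```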

See Hooks for the full reference.

Because channels share state, you can start a coding session on the terminal REPL and continue it on Telegram — the agent remembers the workspace, the current branch, and the in-progress work.

You (terminal): run the tests and show me what's failing
Butler: TestAuth_RBAC_DeniesUnauthorized failed at auth_test.go:142
...later, on Telegram...
You: fix the RBAC test
Butler: Working on auth_test.go:142 — the assertion expected 403 but got 401.
I'll update the middleware to return 403 for authenticated-but-unauthorized
requests.

Real conversations — the coding agent at work

Three live scenarios against a running AI Butler instance exploring its own codebase. Every screenshot is real tool output.

Sometimes you want a sanity check on a config file or a dependency manifest. Ask naturally:

You: Read the file at /tmp/aibutler-demo/example-project/go.mod and tell me:

  • Which Go version does this project require?
  • How many direct dependencies does it have?
  • Which dependencies are ‘indirect’ vs ‘direct’?
  • Flag anything that looks unusual (pinned versions, replace directives, etc.)

Use your file read tool to actually look at the file, don’t guess.

The agent calls file.read on go.mod, parses the module manifest, and produces a review with 5 flagged observations — including a real typo it caught on the spot:

Agent reading go.mod and flagging 5 observations: 1. 'go 1.26.2 is a non-existent version — almost certainly a typo, should be corrected before CI/toolchain failures'. 2. Three pseudo-versioned untagged dependencies pinned to specific commit hashes (observe-sdk, demangle, wabin). 3. No replace directives (clean). 4. danieljoos/wincred as indirect dep via go-keyring — cross-platform keyring support baked in. 5. extism/go-sdk + wazero — Butler running WASM plugins via Extism. Top action item: 'Fix the go 1.26.2 version — try go 1.23.2 or whatever you're actually running locally'

Notice what the agent did beyond answering the literal question: it flagged the Go version as a likely typo, called out three untagged dependencies pinned to specific commit hashes, noted the absence of replace directives as a good sign, and explained what each unusual dependency provides. This is the kind of review a human engineer would do when skimming a manifest, except that it arrived in about 15 seconds without leaving the chat.

When you’re debugging or trying to understand existing architecture, grep is your friend. Butler wraps it in natural language:

You: I’m trying to understand how the capability engine is wired into the agent loop. Search the /tmp/aibutler-demo/example-project/internal/agent directory for the string capability.WithCaps and show me every occurrence with file paths and surrounding context. Then explain what each usage is doing in one sentence.

The agent calls file.search, finds every match, and draws you an ASCII flow diagram of how capabilities propagate through the dispatch chain:

Agent searching the codebase for capability.WithCaps and returning a 'How the wiring works' flow diagram: Config.Caps (CapabilitySet) set at agent creation -> Run() -> executeToolsSerial/Parallel -> capability.WithCaps(ctx, a.cfg.Caps) -> Tools.Execute(toolCtx, call) -> inside tool: capability.CapsFromContext(ctx) -> enforce permissions. Plus a 'key design choice' explanation: 'caps are not passed as function arguments to tools — they travel via the context.Context. This keeps the ToolExecutor.Execute signature clean and lets any tool anywhere in the call stack check permissions without needing extra plumbing.'

This is genuinely useful code review. The agent didn’t just grep — it connected the dots between the call sites, inferred the architectural pattern (context-based propagation vs function arguments), and explained why that pattern was chosen. If you’re reading an unfamiliar codebase trying to understand the permission model, this is the kind of answer that saves an hour of tracing calls by hand.

For a new-to-the-repo question, it’s often easier to ask for the big-picture structure:

You: Give me a high-level map of the /tmp/aibutler-demo/example-project/internal directory. List the top-level subdirectories and for each one describe in one sentence what it does based on the folder name and any obvious clues. I want to understand the project’s package organization quickly.

The agent calls file.list on internal/, walks the subdirectories, and returns a categorized package index with ~55 packages grouped by concern (core runtime, channel frontends, skill/tool integrations, protocol infrastructure, utilities):

Agent mapping /tmp/aibutler-demo/example-project/internal: grouped package listings for Core (agent, capability, model, prompt, session, tool), Channels (discord, slack, telegram, webchat, terminal, irc, etc.), Protocol (mcp, protocol, bridge, proxy, services, telemetry, transaction), Utilities (i18n, compat, offline, backup). Closing summary: '~55 packages total. The architecture is cleanly layered: a core agent runtime, pluggable channel frontends, skill/tool integrations, a security stack, and protocol infrastructure (MCP/A2A) — which maps well to your interests in agent interoperability. The swarm, bridge, and protocol packages in particular suggest the multi-agent layer is already taking shape.'

The closing summary says the architecture "maps well to your interests in agent interoperability". That phrase demonstrates something important: the agent is drawing on its memory of you (it knows from previous conversations that you're interested in A2A and MCP) to tailor its explanation. That's the full stack working together: file tools, memory graph, and natural language. It's the reason AI Butler feels different from a plain coding CLI.

| What you saw | Why it matters |
| --- | --- |
| file.read with structured review | Not just “cat the file”: the agent produces actionable observations (typos, anti-patterns, unusual deps) |
| file.search + architectural diagram | Code archaeology becomes a 15-second conversation instead of a 20-minute trace |
| file.list + memory-aware summary | The agent tailors its explanation to what it already knows about you |
| Every tool call ran inside the workspace sandbox | No way to accidentally read outside the allowed paths |
| Every call was capability-gated + audit-logged | The same security posture that protects your smart home protects your codebase |