
API Reference

MetricsService API v3.4.7 · 120+ endpoints across 14 handler groups · NanoHTTPD on configurable port

Overview

TokForge includes a built-in MetricsService — an HTTP API for remote device control, monitoring, and benchmarking. It runs as an embedded NanoHTTPD server accessible from any app or computer on the same network (or via ADB port forwarding).

This API enables AI agents, developers, and external tools to:

  • Discover device hardware capabilities
  • List, load, and unload inference models
  • Run inference tests and benchmarks
  • Manage conversations and character cards
  • Control app navigation and UI
  • Export/import configuration and benchmark results
  • Monitor performance metrics in real time

Setup

Enabling the API Server

The API server is disabled by default. To enable it:

  1. Open TokForge → Settings → scroll to Advanced
  2. Toggle "Metrics Server" → ON
  3. Choose Bind Host:
    • Localhost only (127.0.0.1) — accessible only from the device itself or via ADB port forwarding. Use this for development and testing.
    • All interfaces (0.0.0.0) — accessible from any device on the local network (e.g., http://192.168.1.100:8088/health). Use this for remote control from another computer, AI agents, or Home Assistant.
  4. Set Port (default: 8088)
  5. Tap "Regenerate" next to the API Key field to generate a new auth token, or enter your own
Security note: When using "All interfaces", the API is accessible to anyone on your network. Always use a strong API key and avoid exposing the port to the internet.

Network Binding Behavior

Build Type | Localhost (127.0.0.1)           | All Interfaces (0.0.0.0)
-----------|---------------------------------|------------------------------------------
Debug      | Auth bypassed (no token needed) | Auth required
Release    | Auth required                   | Forced to 127.0.0.1 (security hardening)

In release builds, the server always binds to 127.0.0.1 regardless of the setting. To access the API from another device with a release build, use ADB port forwarding (see below).

Architecture

Server:          NanoHTTPD (Java HTTP library)
Default port:    8088 (configurable in Settings → Advanced)
Transport:       HTTP/1.1
Auth:            Bearer token (configured in Settings → Advanced → API Key)
Encoding:        JSON (application/json)
Response format: JSON object with status, data, or error fields

Authentication

Getting Your API Key

From the app (recommended):

  1. Open Settings → Advanced → Metrics Server section
  2. Your API key is displayed in the "Metrics Auth Token" field
  3. Tap "Regenerate" to create a new key at any time
  4. Copy the token and use it in your API requests

Programmatically (debug builds only):

# Token is logged at startup (debug builds only, truncated for security)
adb logcat -s MetricsServer | grep "Auth token"

# Or retrieve via DebugReceiver (debug builds only)
adb shell am broadcast -a dev.tokforge.DEBUG_ACTION \
    -n dev.tokforge/.debug.DebugReceiver \
    --es command get_auth_token

# Or read the stored token file directly (debug builds only)
adb shell run-as dev.tokforge cat files/.auth_token

Via the API itself:

# Rotate the token and receive the new one in the response
curl -X POST -H "Authorization: Bearer CURRENT_TOKEN" \
     http://localhost:8088/control/rotate-auth-token
# Response: {"status":"ok", "auth_token":"NEW_TOKEN_HERE", "token_rotated":true}

Using the Token

All POST, DELETE, and most PUT endpoints require authentication. A few read-only endpoints (/health, /version, /metrics, /state/hardware) are accessible without auth.

Include the token in the Authorization header:

curl -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -X POST http://DEVICE_IP:8088/control/load-model \
     -d '{"model_id": 1}'

Missing or invalid tokens return 401 Unauthorized:

{
  "error": "Unauthorized — provide Bearer token in Authorization header"
}
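The header boilerplate can be factored into a small wrapper. This is a sketch: `tf_api`, `tf_url`, `TOKFORGE_URL`, and `TOKFORGE_TOKEN` are names invented for this example, not part of the app.

```shell
#!/bin/sh
# Hypothetical helper: prepend the base URL and auth header to every call.
TOKFORGE_URL="${TOKFORGE_URL:-http://localhost:8088}"

# Build the full URL for an API path (kept separate so it is easy to test).
tf_url() {
    printf '%s%s' "$TOKFORGE_URL" "$1"
}

# Call an endpoint; extra arguments are passed straight to curl.
tf_api() {
    path=$1; shift
    curl -s \
        -H "Authorization: Bearer $TOKFORGE_TOKEN" \
        -H "Content-Type: application/json" \
        "$(tf_url "$path")" "$@"
}

# Usage:
#   tf_api /health
#   tf_api /control/load-model -X POST -d '{"model_id": 1}'
```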

Unauthenticated Endpoints

These endpoints are accessible without a token (useful for health checks and monitoring):

EndpointDescription
GET /healthServer status, loaded models, uptime
GET /versionApp version, build type, device info
GET /metricsModel load state
GET /state/hardwareSoC, RAM, GPU, recommended config

Connecting to the API

Same Network (All Interfaces mode)

If the server is bound to 0.0.0.0 (debug builds), connect directly using the device's IP:

# Find device IP in Settings → About Phone → Status → IP Address
curl http://192.168.1.100:8088/health

ADB Port Forwarding (Localhost mode or Release builds)

Forward the device port to your computer:

# Single device
adb forward tcp:8088 tcp:8088
curl http://localhost:8088/health

# Multiple devices — use unique local ports
adb -s DEVICE1_SERIAL forward tcp:8088 tcp:8088
adb -s DEVICE2_SERIAL forward tcp:8089 tcp:8088
adb -s DEVICE3_SERIAL forward tcp:8090 tcp:8088

On-Device Access

From the device itself (e.g., Termux, or another app):

curl http://127.0.0.1:8088/health

Use a unique local port per active test script or process. Reusing the same local port (for example, two scripts both forwarding tcp:8088) silently overwrites the earlier forward and causes intermittent control-plane disconnects.
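The unique-port rule can be automated for a fleet of devices. A sketch: the 8088+i numbering scheme is this example's choice, and it assumes the standard `adb devices` output format.

```shell
#!/bin/sh
# Forward local port 8088+i to device-side port 8088 on the i-th device.
# Pure helper so the port mapping itself is easy to test.
local_port_for() {
    # $1 = zero-based device index
    echo $((8088 + $1))
}

if command -v adb >/dev/null 2>&1; then
    i=0
    adb devices | awk 'NR>1 && $2=="device" {print $1}' | while read -r serial; do
        port=$(local_port_for "$i")
        adb -s "$serial" forward "tcp:$port" tcp:8088
        echo "$serial -> http://localhost:$port"
        i=$((i + 1))
    done
fi
```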

1. Health & Status

GET /health No Auth

Check server status and which inference backends are available.

GET /metrics No Auth

Minimal endpoint returning only loaded model status.

GET /version No Auth

App version, build info, package name, device, and Android version.

GET /storage Auth Required

Storage usage information: available space, model directory size, database usage.

2. State Endpoints

GET /state Auth Required

Complete device and app state snapshot: model loading status, settings, memory, battery, thermal.

GET /state/settings Auth Required

User settings: theme, system prompt, remote API config, remote model name, API endpoints.

GET /state/inference Auth Required

Inference configuration state: loaded models, backend availability, context length, quantization settings.

GET /state/hardware No Auth

Device hardware profile: SoC, cores, GPU, RAM, and recommended optimal configuration.

GET /performance Auth Required

Real-time performance metrics: memory usage, battery, thermal status, last generation statistics.

3. Model Endpoints

GET /models Auth Required

List all downloaded models with metadata: name, path, size, quantization, parameter count, type.

GET /models/{id} Auth Required

Get detailed information about a specific model by database ID.

POST /control/load-model Auth Required

Load a model from disk. Unloads any currently loaded model. Auto-loads draft model for speculative decoding.

POST /control/unload-all Auth Required

Unload all currently loaded models (both MNN and GGUF).

POST /control/scan-models Auth Required

Scan models directory and register any new models found.

POST /control/download-model Auth Required

Download a model from a URL. Runs in background; monitor with /control/download-status.

GET /control/download-status Auth Required

Get progress of ongoing model download: bytes, percentage, speed, estimated time remaining.
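A background download can be supervised with a small polling loop. This is a sketch: the `percentage` field name is an assumption based on the description above, so check an actual response before relying on it.

```shell
#!/bin/sh
# Decide whether a download-status response reports completion.
# Field name "percentage" is assumed, not confirmed.
download_done() {
    # $1 = JSON response body; prints "yes" when percentage >= 100
    pct=$(printf '%s' "$1" | sed -n 's/.*"percentage":[[:space:]]*\([0-9.]*\).*/\1/p')
    [ -n "$pct" ] || pct=0
    awk -v p="$pct" 'BEGIN { if (p >= 100) print "yes"; else print "no" }'
}

# Poll every 2 seconds until the download finishes.
poll_download() {
    while :; do
        resp=$(curl -s -H "Authorization: Bearer $TOKEN" \
            http://localhost:8088/control/download-status)
        [ "$(download_done "$resp")" = "yes" ] && break
        sleep 2
    done
}
```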

POST /control/delete-model Auth Required

Delete a model file from disk and remove from database.

POST /control/rotate-auth-token Auth Required

Rotate authentication token. Returns new token in response.

POST /control/cancel-download Auth Required

Cancel an ongoing model download.

4. Configuration

GET /config Auth Required

Get current system configuration details: threads, context, quantization, KV cache settings, GPU layers.

POST /control/update-config Auth Required

Update sampling parameters: temperature, top-p, top-k, max tokens, repeat penalty.

POST /control/set-settings Auth Required

Update app settings: theme, system prompt, metrics server config, remote API endpoints, persona settings.

POST /control/set-inference-config Auth Required

Set advanced inference config: threading, context, KV cache, flash attention, GPU layers, draft model.

POST /control/switch-backend Auth Required

Switch inference backend: mnn, gguf, or remote.

POST /control/reload-model Auth Required

Reload the currently loaded model (apply config changes).

5. Inference & Generation

POST /test-prompt Auth Required

Generate text from a single prompt without creating a conversation. Returns tokens, speed, and output.

GET /control/generation-status Auth Required

Get current generation progress: tokens generated, speed, elapsed time, estimated remaining.

POST /control/stop-generation Auth Required

Stop an ongoing generation.

POST /control/auto-tune Auth Required

Run auto-tuning: sweep thread counts and configs, recommend optimal settings for current hardware + model.

POST /control/auto-tune/cancel Auth Required

Cancel an ongoing auto-tune operation.

POST /control/scenario/run Auth Required

Run a pre-defined or custom inference scenario. Returns scenario results and metrics.

GET /control/scenario/status Auth Required

Get status and progress of an ongoing scenario execution.

6. Conversations

GET /conversations Auth Required

List all conversations with titles, timestamps, backend types, and metadata.

GET /conversations/{id}/messages Auth Required

Get all messages in a conversation with role, content, timestamps, and token counts.

POST /conversations/{id}/send Auth Required

Send a message to a conversation and receive the AI response.

POST /control/create-conversation Auth Required

Create a new conversation with optional title, backend type, and character.

POST /control/reflect Auth Required

Trigger background memory extraction on current conversations for knowledge synthesis.

GET /control/reflect/tasks Auth Required

List active and completed reflection tasks with status, progress, and results.

DELETE /conversations/{id} Auth Required

Delete a conversation and all its messages.

7. Characters

GET /characters Auth Required

List all character cards with names, descriptions, personalities, and greetings.

GET /characters/{id}/export Auth Required

Export a character card in card format (JSON or PNG).

POST /characters/import Auth Required

Import a character card from an uploaded file (multipart).

DELETE /characters/{id} Auth Required

Delete a character card.

8. Benchmarks

POST /benchmark/run Auth Required

Run a single benchmark: load model, measure inference speed, save result. Configurable prompt, tokens, runs.

GET /benchmark/results Auth Required

Query benchmark results with filtering by SoC, model, backend. Supports pagination.

GET /benchmark/matrix Auth Required

Get results organized as a multi-dimensional matrix for cross-device comparison.

POST /benchmark/auto-matrix Auth Required

Run benchmarks across all model/backend combinations. Async — poll /benchmark/auto-matrix/status.

GET /benchmark/auto-matrix/status Auth Required

Get progress of ongoing auto-matrix benchmark generation.

GET /benchmark/export Auth Required

Export all benchmark results as JSON or CSV for analysis.

POST /benchmark/import Auth Required

Import benchmark results from another device (multipart JSON upload).

GET /benchmark/optimal-config Auth Required

Get recommended optimal configuration from benchmark data for current hardware/model.

DELETE /benchmark/clear Auth Required

Delete all benchmark results and profiles.

9. Debug

GET /debug/export Auth Required

Export full app state as ZIP: models, conversations, settings, logs, database.

GET /debug/errors Auth Required

Get crash logs and error history from the app.

GET /debug/log Auth Required

Get internal app logs. Query params: lines (max 2000), filter tag, since (timestamp).

GET /debug/logcat Auth Required

Get logcat output filtered by tag. Query params: lines (max 2000), filter tag.

GET /debug/live-telemetry Auth Required

Stream live performance telemetry: memory, thermal, power draw (Server-Sent Events).

DELETE /debug/live-telemetry Auth Required

Clear telemetry buffer and stop active streams.

GET /debug/live-telemetry/config Auth Required

Get current telemetry configuration and collection settings.

POST /debug/live-telemetry/config Auth Required

Configure telemetry collection: sampling interval, metrics to track, retention policy.

GET /debug/hybrid-diagnostics Auth Required

Get diagnostic data for hybrid inference: layer assignments, fallback events, performance attribution.

GET /debug/mnn-trace Auth Required

Get MNN backend execution trace: kernel timing, memory allocation, layer profiling.

DELETE /debug/mnn-trace Auth Required

Clear collected MNN trace buffer.

POST /debug/crash-test Auth Required

Intentionally trigger a crash for testing error reporting and recovery.

GET /debug/config Auth Required

Get full system configuration and debug runtime settings.

10. UI Control

POST /control/navigate Auth Required

Navigate to a specific screen: home, chat, models, settings, characters, benchmarks.

POST /control/toast Auth Required

Show a toast notification on the device screen.

11. QNN Hardware Acceleration

POST /control/qnn-probe Auth Required

Probe Qualcomm NPU capabilities synchronously. Experimental feature for hardware acceleration detection.

POST /control/qnn-probe-async Auth Required

Probe QNN capabilities asynchronously. Returns job ID; poll /control/qnn-probe-status for progress.

GET /control/qnn-probe-status Auth Required

Get status and results of async QNN hardware acceleration probe.

GET /control/qnn-hybrid-capabilities Auth Required

Get available Qualcomm NPU hybrid routing capabilities and performance profiles.

GET /control/qnn-m1-readiness Auth Required

Get M1/ARM readiness metrics and compatibility status for QNN backend.

12. MNN Memory Policy

GET /control/mnn-memory-policy Auth Required

Get current MNN backend memory allocation policy settings.

POST /control/mnn-memory-policy Auth Required

Configure MNN backend memory allocation policy and optimization strategy.

13. Memory Management (Facts, Edges, Documents)

GET /memory/facts Auth Required

List memory facts with optional filtering and search. Supports pagination and query parameters.

POST /memory/facts Auth Required

Create a new memory fact with content, tags, and optional metadata.

POST /memory/facts/{id}/pin Auth Required

Toggle pin status of a memory fact (important/frequently referenced).

POST /memory/facts/{id} Auth Required

Update an existing memory fact's content, tags, or metadata.

DELETE /memory/facts/{id} Auth Required

Archive a memory fact (soft delete; recoverable).

DELETE /memory/facts/{id}/hard Auth Required

Permanently delete a memory fact (hard delete; unrecoverable).

GET /memory/edges Auth Required

List relationship edges connecting memory facts. Supports filtering and search.

DELETE /memory/edges/{id} Auth Required

Delete a relationship edge between two memory facts.

GET /memory/documents Auth Required

List reference documents used in memory system for context and knowledge grounding.

DELETE /memory/documents/{id} Auth Required

Delete a reference document from the memory system.

GET /memory/search Auth Required

Full-text search across memory facts and documents. Supports query parameters.

GET /memory/stats Auth Required

Get memory system statistics: fact counts, edge counts, document counts, storage usage.

14. Auto-Tuning & Optimization

GET /control/auto-tune/status Auth Required

Get current auto-tune progress: phase, results, thermal status, ETA.

POST /control/auto-tune/cancel Auth Required

Cancel an ongoing auto-tune optimization process.

POST /control/forge-instant Auth Required

Run instant inference generation with optimized settings for rapid response.

Workflow Examples

7-Step AI Agent Workflow

A complete example of using the TokForge API to build an autonomous AI agent:

Step 1: Verify Device Readiness

curl -s http://localhost:8088/health | jq '.mnn_loaded'
# Output: true (device ready)

Step 2: List Available Models

curl -s -H "Authorization: Bearer TOKEN" \
     http://localhost:8088/models | jq '.data[] | .name'

Step 3: Load a Model

curl -X POST \
     -H "Authorization: Bearer TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"model_id": 3}' \
     http://localhost:8088/control/load-model

Step 4: Create a Conversation

curl -X POST \
     -H "Authorization: Bearer TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"title": "Agent Session", "backend": "mnn"}' \
     http://localhost:8088/control/create-conversation
# Returns: {"status":"ok", "id": "conv_12345"}

Step 5: Send Message and Get Response

curl -X POST \
     -H "Authorization: Bearer TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"message": "What is 2+2?"}' \
     http://localhost:8088/conversations/conv_12345/send
# Returns: {"status":"ok", "response": "2+2 equals 4."}

Step 6: Monitor Performance

curl -s http://localhost:8088/performance | jq '.last_generation_stats'
# Shows: tokens/sec, memory usage, thermal status

Step 7: Run Benchmark and Export

curl -X POST \
     -H "Authorization: Bearer TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"prompt": "Benchmark prompt", "num_tokens": 100}' \
     http://localhost:8088/benchmark/run

# Export results
curl -s http://localhost:8088/benchmark/export > results.json
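The seven steps above can be strung together into one script. A sketch only: the response field names (`id`, `response`) follow the sample payloads shown in the steps, and `TOKEN`/`BASE` are this example's variable names.

```shell
#!/bin/sh
# End-to-end sketch of the 7-step workflow, wrapped in a function.
# Assumes TOKEN is set and the server is reachable (ADB forward or on-device).
BASE="${BASE:-http://localhost:8088}"

# Pull the conversation id out of the create-conversation response
# (matches the sample payload {"status":"ok", "id": "conv_12345"}).
extract_id() {
    printf '%s' "$1" | sed -n 's/.*"id":[[:space:]]*"\([^"]*\)".*/\1/p'
}

run_agent_session() {
    auth="Authorization: Bearer $TOKEN"
    json="Content-Type: application/json"

    # Step 1: verify the server is up
    curl -sf "$BASE/health" >/dev/null || return 1

    # Step 3: load model id 3 (as in the example above)
    curl -sf -X POST -H "$auth" -H "$json" \
         -d '{"model_id": 3}' "$BASE/control/load-model" >/dev/null || return 1

    # Step 4: create a conversation and capture its id
    resp=$(curl -sf -X POST -H "$auth" -H "$json" \
         -d '{"title": "Agent Session", "backend": "mnn"}' \
         "$BASE/control/create-conversation") || return 1
    conv=$(extract_id "$resp")

    # Step 5: send a message and print the reply
    curl -sf -X POST -H "$auth" -H "$json" \
         -d '{"message": "What is 2+2?"}' \
         "$BASE/conversations/$conv/send"
}
```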

Error Handling

Code | Meaning
-----|----------------------------------------------
200  | Success
400  | Bad Request (invalid parameters)
401  | Unauthorized (missing or invalid auth token)
404  | Not Found (resource doesn't exist)
408  | Request Timeout (operation took too long)
409  | Conflict (another operation already running)
500  | Internal Server Error
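Scripts can branch on these codes by asking curl for the status alone. A sketch; the hint strings are this example's wording, not API output.

```shell
#!/bin/sh
# Map an HTTP status code to a short action hint, mirroring the table above.
explain_status() {
    case "$1" in
        200) echo "ok" ;;
        400) echo "bad request: check parameters" ;;
        401) echo "unauthorized: check Bearer token" ;;
        404) echo "not found" ;;
        408) echo "timeout: retry" ;;
        409) echo "conflict: another operation is running" ;;
        429) echo "rate limited: honor Retry-After" ;;
        500) echo "server error" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Usage: capture only the status code, then branch on it
# code=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8088/health)
# explain_status "$code"
```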

Rate Limits

The API applies per-token rate limits to prevent abuse:

Endpoint Type               | Limit        | Window
----------------------------|--------------|-----------
Read (GET, unauthenticated) | 100 requests | 1 minute
Read (GET, authenticated)   | 200 requests | 1 minute
Write (POST, DELETE)        | 50 requests  | 1 minute
Benchmark operations        | 5 concurrent | Continuous
Model downloads             | 1 concurrent | Continuous

Exceeding a limit returns 429 Too Many Requests; the Retry-After header indicates how many seconds to wait before retrying.
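One way to honor Retry-After is a small delay helper plus a header read. A sketch; the fallback default is this example's choice.

```shell
#!/bin/sh
# Compute how long to sleep after a 429: use the Retry-After header value
# when it is a plain number of seconds, otherwise fall back to a default.
retry_delay() {
    # $1 = Retry-After header value (may be empty), $2 = default seconds
    case "$1" in
        ''|*[!0-9]*) echo "$2" ;;   # missing or non-numeric: use default
        *)           echo "$1" ;;
    esac
}

# Usage sketch: read the header from the response, then sleep
# hdr=$(curl -s -D - -o /dev/null -H "Authorization: Bearer $TOKEN" \
#         http://localhost:8088/models | \
#       awk 'tolower($1)=="retry-after:" {print $2}' | tr -d '\r')
# sleep "$(retry_delay "$hdr" 5)"
```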

Security

  • Localhost only by default: Server binds to 127.0.0.1 in release builds, so it is not reachable over the network without ADB port forwarding
  • Auth tokens: Generated per-session, configured in Settings. Treat as secrets. Rotate regularly with /control/rotate-auth-token
  • No HTTPS: Uses plain HTTP over ADB tunnel. ADB provides the security boundary.
  • Debug mode only: API is only available in debug builds (BuildConfig.DEBUG) or when explicitly enabled.
  • Network binding: Release builds force localhost binding regardless of settings. Use ADB port forwarding for remote access.
  • CORS: Not applicable (native Android app, no browser context)
  • Input validation: All parameters validated server-side. Untrusted input sanitized before use.