Tools

Five read-only query primitives plus two side channels, all available to the agent inside its sandboxed REPL.

The five query primitives

`schema()`

Describes the event store: distinct services, levels, row count, time window. Usually the agent's first call.
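As a rough illustration of the summary `schema()` produces, here is a minimal sketch over an in-memory list. The `EVENTS` list, the event field names (`ts`, `service`, `level`, `msg`, `raw`), epoch-second timestamps, and the keys of the returned dict are all assumptions made for this example, not the tool's documented output format.

```python
# Hypothetical in-memory stand-in for the event store; the real store,
# its field names, and its timestamp format may differ.
EVENTS = [
    {"ts": 100.0, "service": "api", "level": "info", "msg": "request start", "raw": {}},
    {"ts": 115.0, "service": "db", "level": "error", "msg": "connection refused", "raw": {}},
    {"ts": 130.0, "service": "api", "level": "error", "msg": "upstream timeout", "raw": {}},
]


def schema():
    """Summarise the store: distinct services, levels, row count, time window."""
    timestamps = [e["ts"] for e in EVENTS]
    return {
        "services": sorted({e["service"] for e in EVENTS}),
        "levels": sorted({e["level"] for e in EVENTS}),
        "rows": len(EVENTS),
        # (earliest, latest) timestamp, or None for an empty store.
        "time_window": (min(timestamps), max(timestamps)) if timestamps else None,
    }
```

Calling it first gives the agent the service names and time range it needs to parameterise later queries.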

`top_errors(limit=20)`

Ranks error/warn events by frequency. Good for surfacing the loudest failure mode.
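The ranking can be pictured as a frequency count over error and warn events. The sketch below assumes events are grouped by `(service, msg)` and that levels are spelled `error` and `warn`; the real grouping key and output shape are not specified here, so treat both as hypothetical.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the event store.
EVENTS = [
    {"service": "db", "level": "error", "msg": "connection refused"},
    {"service": "api", "level": "warn", "msg": "slow response"},
    {"service": "db", "level": "error", "msg": "connection refused"},
    {"service": "api", "level": "info", "msg": "request start"},
]


def top_errors(limit=20):
    """Rank error/warn events by how often each (service, msg) pair occurs."""
    counts = Counter(
        (e["service"], e["msg"])
        for e in EVENTS
        if e["level"].lower() in ("error", "warn")  # assumed level names
    )
    return [
        {"service": service, "msg": msg, "count": n}
        for (service, msg), n in counts.most_common(limit)
    ]
```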

`search(pattern, limit=10)`

Case-insensitive substring match across msg and raw. Results ordered by time.
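A minimal sketch of the matching rule, assuming `raw` is a dict that can be serialised to JSON for searching, that ordering is ascending by `ts`, and that `limit` is applied after sorting. None of these details are confirmed by the description above.

```python
import json

# Hypothetical in-memory stand-in for the event store.
EVENTS = [
    {"ts": 130.0, "msg": "Upstream TIMEOUT", "raw": {"host": "api-1"}},
    {"ts": 100.0, "msg": "request start", "raw": {"detail": "timeout=30s configured"}},
    {"ts": 115.0, "msg": "connection refused", "raw": {"host": "db-1"}},
]


def search(pattern, limit=10):
    """Case-insensitive substring match over msg and the serialised raw payload."""
    needle = pattern.lower()
    hits = [
        e for e in EVENTS
        if needle in e["msg"].lower() or needle in json.dumps(e["raw"]).lower()
    ]
    # Oldest first (assumed), then truncate to the requested limit.
    return sorted(hits, key=lambda e: e["ts"])[:limit]
```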

`around(ts, window_s=60, service=None)`

Pulls events within ±window_s seconds of a timestamp. Use it to see what happened around a specific event, optionally filtered to one service.
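The window filter can be sketched as follows. It assumes numeric epoch-second timestamps, inclusive window bounds, and time-ordered results; the real tool may treat any of these differently.

```python
# Hypothetical in-memory stand-in for the event store.
EVENTS = [
    {"ts": 40.0, "service": "api", "msg": "request start"},
    {"ts": 100.0, "service": "api", "msg": "upstream timeout"},
    {"ts": 130.0, "service": "db", "msg": "connection refused"},
    {"ts": 160.0, "service": "api", "msg": "retry"},
    {"ts": 161.0, "service": "api", "msg": "retry"},
]


def around(ts, window_s=60, service=None):
    """Return events within +/- window_s seconds of ts, optionally for one service."""
    hits = [
        e for e in EVENTS
        if abs(e["ts"] - ts) <= window_s  # inclusive bounds (assumed)
        and (service is None or e["service"] == service)
    ]
    return sorted(hits, key=lambda e: e["ts"])
```

Narrowing `window_s` and passing `service` together is a quick way to cut a noisy window down to one component's view of the incident.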

`trace(trace_id)`

Matches any event whose raw.trace_id or raw.request_id equals the given id. Best way to follow a single request through a multi-service call chain.
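To make the matching rule concrete, here is a small sketch that checks both id fields. It assumes `raw` is a dict and that results come back in time order; the sample events and service names are invented for the example.

```python
# Hypothetical in-memory stand-in for the event store.
EVENTS = [
    {"ts": 100.0, "service": "gateway", "raw": {"request_id": "abc"}},
    {"ts": 101.0, "service": "auth", "raw": {"trace_id": "abc"}},
    {"ts": 102.0, "service": "db", "raw": {"trace_id": "zzz"}},
    {"ts": 103.0, "service": "billing", "raw": {}},
]


def trace(trace_id):
    """Return events whose raw.trace_id or raw.request_id equals trace_id."""
    if trace_id is None:
        # Guard: otherwise events lacking both fields would match None.
        return []
    hits = [
        e for e in EVENTS
        if trace_id in (e["raw"].get("trace_id"), e["raw"].get("request_id"))
    ]
    return sorted(hits, key=lambda e: e["ts"])
```

Checking both fields matters when services disagree on naming: here the gateway logs `request_id` while auth logs `trace_id` for the same request.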

Side channels

`llm_query(question, context="")`

Dispatches a free-form question to a secondary LLM. Use sparingly: each call counts against max_llm_calls. Good for judgement calls such as "is this stack trace characteristic of a GC pause or of lock contention?"
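One way to picture the call budget is a counter wrapped around the model call. This is only a sketch: `make_llm_query`, `LLMBudgetExceeded`, and the `backend` callable are hypothetical names, and what actually happens once max_llm_calls is exhausted is an assumption rather than documented behaviour.

```python
class LLMBudgetExceeded(RuntimeError):
    """Hypothetical error raised once the call budget is spent."""


def make_llm_query(backend, max_llm_calls):
    """Build an llm_query function that enforces a per-run call budget.

    backend is a stand-in callable (question, context) -> str; the real
    secondary-LLM client is not shown here.
    """
    calls_used = 0

    def llm_query(question, context=""):
        nonlocal calls_used
        if calls_used >= max_llm_calls:
            raise LLMBudgetExceeded(f"budget of {max_llm_calls} calls exhausted")
        calls_used += 1
        return backend(question, context)

    return llm_query
```

Because every call spends budget, it usually pays to narrow the evidence with the query primitives first and pass only the relevant excerpt as `context`.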

`submit_incident_report(report: dict)`

Terminal. Validates report against the IncidentReport schema and ends the run. Anything missing or malformed raises a validation error the agent will see; on success the case file is sealed and written.
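The validate-then-reject behaviour might look roughly like the sketch below. The IncidentReport schema is not reproduced in this section, so the field names (`summary`, `root_cause`, `evidence`), the `ReportValidationError` class, and the `validate_report` helper are purely hypothetical placeholders.

```python
# Hypothetical required fields; the real IncidentReport schema may differ entirely.
REQUIRED_FIELDS = {"summary": str, "root_cause": str, "evidence": list}


class ReportValidationError(ValueError):
    """Hypothetical error surfaced to the agent when a report is rejected."""


def validate_report(report):
    """Collect every missing or mistyped field, then raise once with all of them."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in report:
            problems.append(f"missing field: {name}")
        elif not isinstance(report[name], expected):
            got = type(report[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {got}")
    if problems:
        raise ReportValidationError("; ".join(problems))
    return report
```

Reporting all problems in one error, rather than stopping at the first, lets the agent see everything that is wrong with a rejected report at once.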