Observability
Structured logs, metrics, traces, and alerts. Single pane of glass across all providers.
Logs
Every agent emits structured, level-tagged logs (debug, info, warn, error). Logs are streamed in real time via WebSocket and persisted to PostgreSQL for querying. Filter by session, agent, level, or free-text search.
// Stream logs in real time via WebSocket
const stream = theazo.logs.stream({
  sessionId: 'ses_...',
  level: 'error', // only errors
})
for await (const entry of stream) {
  console.log(entry.timestamp) // ISO 8601
  console.log(entry.level) // 'error'
  console.log(entry.agentId) // 'agt_...'
  console.log(entry.message) // structured log message
}
SSE streaming
In addition to WebSocket, logs can be consumed via Server-Sent Events (SSE) at GET /v1/agents/:id/logs/stream. SSE is simpler for browser-based dashboards — no WebSocket library needed, just a native EventSource. The server emits two event types: log (with level, message, and timestamp) and heartbeat (keep-alive every 15s).
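Because a heartbeat arrives every 15 seconds, a client can treat a long silence as a sign that the connection has stalled. The sketch below shows one way to track that. `createStaleTracker` and the tolerance of three missed heartbeats are illustrative assumptions, not part of the Theazo SDK.

```javascript
// Hypothetical helper (not part of the Theazo SDK): remembers when the last
// SSE event arrived and reports whether the stream looks stale.
const HEARTBEAT_INTERVAL_MS = 15000 // heartbeat cadence stated in this section
const MISSED_BEFORE_STALE = 3 // assumption: tolerate three missed heartbeats

function createStaleTracker(now = Date.now()) {
  let lastEventAt = now
  return {
    // Call this from both the 'log' and the 'heartbeat' listeners.
    touch(at = Date.now()) {
      lastEventAt = at
    },
    // True once the silence exceeds the tolerated number of heartbeat intervals.
    isStale(at = Date.now()) {
      return at - lastEventAt > HEARTBEAT_INTERVAL_MS * MISSED_BEFORE_STALE
    },
  }
}
```

Calling `touch()` inside each listener and checking `isStale()` on a timer gives a dashboard a signal for when to close and reopen its `EventSource`. The basic subscription looks like this: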
// Stream logs via SSE (browser-native, no dependencies)
const source = new EventSource(
  'https://api.theazo.com/v1/agents/agt_.../logs/stream',
  // Note: auth header requires polyfill (e.g. eventsource-polyfill)
  // or pass token as query param: ?token=thz_...
)
source.addEventListener('log', (event) => {
  const entry = JSON.parse(event.data)
  console.log(entry.level) // 'info' | 'warn' | 'error' | 'debug'
  console.log(entry.message) // structured log message
  console.log(entry.timestamp) // ISO 8601
  console.log(entry.agentId) // 'agt_...'
})
source.addEventListener('heartbeat', () => {
  // Connection alive; no action needed
})
// Clean up when done
source.close()
Querying historical logs
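Historical queries are paginated with `limit` and `offset`, and the response carries a `total`. A minimal sketch of walking every page follows. It assumes `total` counts all matching entries rather than only the current page, and `fetchAllLogs` is a hypothetical helper, not an SDK method.

```javascript
// Hypothetical helper: collects every page of a logs.query() result set.
// Assumption: `total` is the count of all entries matching the filters.
async function fetchAllLogs(client, filters, pageSize = 100) {
  const all = []
  let offset = 0
  while (true) {
    const { logs, total } = await client.logs.query({
      ...filters,
      limit: pageSize,
      offset,
    })
    all.push(...logs)
    offset += logs.length
    // Stop when everything has been fetched, or on an empty page so a
    // mismatched `total` cannot cause an endless loop.
    if (logs.length === 0 || offset >= total) break
  }
  return all
}
```

A single query with the full set of filters looks like this: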
const { logs, total } = await theazo.logs.query({
  sessionId: 'ses_...',
  agentId: 'agt_...',
  level: 'warn',
  search: 'timeout',
  since: '2025-05-01T00:00:00Z',
  until: '2025-05-08T00:00:00Z',
  limit: 100,
  offset: 0,
})
Metrics
Theazo collects platform-level metrics you can query for dashboards and alerting. Key metrics include agents.active, agents.boot_time, cost.total, and error_rate.
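For a dashboard tile or a simple alert rule, a returned series usually has to be reduced to a few numbers. The sketch below assumes the `{ timestamp, value }` point shape shown in this section's example output. `summarizeSeries` is a hypothetical helper, not part of the SDK.

```javascript
// Hypothetical helper: reduces a metric time series to summary statistics.
// Assumes each point has the shape { timestamp, value }.
function summarizeSeries(points) {
  if (points.length === 0) {
    return { min: null, max: null, mean: null, peakAt: null }
  }
  let min = Infinity
  let max = -Infinity
  let sum = 0
  let peakAt = null
  for (const { timestamp, value } of points) {
    if (value < min) min = value
    if (value > max) {
      max = value
      peakAt = timestamp // timestamp of the first occurrence of the maximum
    }
    sum += value
  }
  return { min, max, mean: sum / points.length, peakAt }
}
```

The query that produces such a series looks like this: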
const metrics = await theazo.metrics.query({
  metric: 'agents.active',
  interval: '1h', // 1m, 5m, 1h, 1d
  since: '2025-05-01',
  until: '2025-05-08',
})
// Returns time series:
// [
//   { timestamp: '2025-05-01T00:00:00Z', value: 12 },
//   { timestamp: '2025-05-01T01:00:00Z', value: 15 },
//   ...
// ]
Per-agent cost breakdown
Use theazo.usage.forUser(userId) to get a detailed cost breakdown for a user. The response breaks down spending into model (LLM inference cost), compute (sandbox compute time), and total (combined). All amounts are integer cents. The billingMode field indicates who pays for what — in BYOI modes, the corresponding cost field returns { amount: 0 } because the AgentCo pays the provider directly.
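Since amounts arrive as integer cents, they need converting before display. The sketch below shows one way to do that and to sanity-check a breakdown. Both helpers are hypothetical, the division by 100 assumes a currency with two decimal places such as USD, and the check assumes `total` is exactly `model` plus `compute`.

```javascript
// Hypothetical helper: renders an integer-cent amount as a currency string.
// Assumption: the currency has two decimal places (true for USD).
function formatMoney({ amount, currency }) {
  return new Intl.NumberFormat('en-US', {
    style: 'currency',
    currency: currency.toUpperCase(),
  }).format(amount / 100)
}

// Hypothetical check: `total` should equal `model` + `compute`, in cents.
function isConsistent(usage) {
  return usage.total.amount === usage.model.amount + usage.compute.amount
}
```

Integer arithmetic on cents avoids floating-point rounding until the final formatting step. The breakdown itself is fetched like this: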
const usage = await theazo.usage.forUser('user_123')
console.log(usage.model) // { amount: 18, currency: 'usd' } — $0.18
console.log(usage.compute) // { amount: 24, currency: 'usd' } — $0.24
console.log(usage.total) // { amount: 42, currency: 'usd' } — $0.42
console.log(usage.billingMode) // 'managed' | 'byoi_compute' | 'byoi_models' | 'byoi_both'
// BYOI example: AgentCo brings their own compute (E2B/Fly credentials)
// → compute cost is 0 (they pay the provider directly)
const byoiUsage = await theazo.usage.forUser('user_byoi...')
console.log(byoiUsage.model) // { amount: 18, currency: 'usd' }
console.log(byoiUsage.compute) // { amount: 0, currency: 'usd' }
console.log(byoiUsage.total) // { amount: 18, currency: 'usd' }
console.log(byoiUsage.billingMode) // 'byoi_compute'
Traces
In orchestrator mode (Full Platform and Primitives-only), Theazo creates per-tool-call spans within each agent run. Every model call, tool invocation, and file operation is captured as a span with timing and metadata. In infra-only mode, traces are limited to compute-level events (boot, exec, destroy).
const { traces } = await theazo.traces.list({
  agentId: 'agt_...',
})
for (const trace of traces) {
  console.log(trace.traceId) // 'trc_...'
  console.log(trace.spans) // Span[]
  // Each span: { name, startTime, endTime, attributes, status }
}
Trace waterfall visualization
Traces render as a waterfall timeline in the dashboard, showing every operation in an agent run as a horizontal bar on a shared time axis. Each span represents one of four operation types: model call, tool execution, file I/O, or sandbox boot. Spans carry a name, duration, start/end timestamps, status (ok or error), and arbitrary metadata. Nested spans show parent-child relationships — for example, a model call that triggers a tool execution which performs file I/O.
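To make the waterfall idea concrete, the sketch below flattens nested spans into rows that carry a nesting depth, an offset from the start of the trace, and a duration. It is only an illustration of the layout, not how the dashboard is implemented. `toWaterfallRows` is a hypothetical helper, and it assumes spans expose `name`, `startTime`, `endTime` (ISO 8601), and an optional `children` array.

```javascript
// Hypothetical helper: flattens nested spans into waterfall rows.
// Assumes each span has name, startTime, endTime and optional children.
function toWaterfallRows(spans, traceStartMs, depth = 0, rows = []) {
  for (const span of spans) {
    const start = Date.parse(span.startTime)
    const end = Date.parse(span.endTime)
    rows.push({
      name: span.name,
      depth, // nesting level, usable for indentation
      offsetMs: start - traceStartMs, // left edge of the bar
      durationMs: end - start, // width of the bar
    })
    // Children are emitted directly after their parent, one level deeper.
    toWaterfallRows(span.children || [], traceStartMs, depth + 1, rows)
  }
  return rows
}
```

Each row maps directly to one horizontal bar: `offsetMs` positions it on the shared time axis and `depth` shows which span triggered it.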
Use theazo.traces.get(traceId) to fetch a full trace with nested spans:
const trace = await theazo.traces.get('trc_...')
console.log(trace.traceId) // 'trc_...'
console.log(trace.agentId) // 'agt_...'
console.log(trace.duration) // total duration in ms
console.log(trace.status) // 'ok' | 'error'
// Spans are nested — each span can have children
for (const span of trace.spans) {
  console.log(span.name) // 'model.call' | 'tool.exec' | 'file.read' | 'sandbox.boot'
  console.log(span.startTime) // ISO 8601
  console.log(span.endTime) // ISO 8601
  console.log(span.duration) // ms
  console.log(span.status) // 'ok' | 'error'
  console.log(span.metadata) // { model: 'claude-sonnet-4-20250514', tokens: 1847, ... }
  console.log(span.children) // Span[]: nested tool calls, file ops, etc.
}
Alerts
Theazo fires alerts for conditions that need attention. Alerts appear in the dashboard and can trigger webhooks.
- Failed agents: an agent threw an unrecoverable error or exceeded max retries.
- Cost limit warnings: a session has used 80% or more of its maxCost budget.
- Long-running agents: an agent has been running for more than 30 minutes without completing.
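On the receiving side, a webhook endpoint typically parses the JSON body and branches on the `event` field. The sketch below turns a payload into a one-line summary. `describeAlert` is a hypothetical helper, only the `agent.failed` event shown in this section is handled explicitly, and the cost amount is assumed to be in integer cents.

```javascript
// Hypothetical helper: summarizes an alert webhook payload in one line.
// Only 'agent.failed' is handled explicitly; no other event names are assumed.
function describeAlert(payload) {
  const { event, data = {} } = payload
  if (event === 'agent.failed') {
    // Assumption: cost.amount is in integer cents.
    const cost = data.cost ? ` after spending ${data.cost.amount} cents` : ''
    return `Agent ${data.agentId} failed in session ${data.sessionId}: ${data.error}${cost}`
  }
  // Fallback for any event this sketch does not know about.
  return `Unhandled alert event: ${event}`
}
```

Keeping the summary logic in a pure function makes it easy to test without a running HTTP server. An example payload looks like this: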
// Subscribe to alerts via webhook
// POST https://yourapp.com/webhooks/theazo
{
  "event": "agent.failed",
  "data": {
    "agentId": "agt_...",
    "sessionId": "ses_...",
    "error": "Max retries exceeded",
    "duration": 847,
    "cost": { "amount": 42, "currency": "usd" }
  }
}
Observability by mode
The depth of observability data depends on which mode you operate in. Full Platform and Primitives-only modes provide the richest data because Theazo controls the agent loop. Infra-only mode provides compute-level metrics only.
Method reference
| Method | Returns | Description |
| --- | --- | --- |
| theazo.logs.stream(filters) | AsyncIterable<LogEntry> | Stream logs in real time via WebSocket. Filter by session, agent, level. |
| theazo.logs.query(filters) | Promise<{ logs, total }> | Query historical logs with filters, pagination, and search. |
| theazo.metrics.query(opts) | Promise<TimeSeries[]> | Query metric time series by name, interval, and time range. |
| theazo.traces.list(filters) | Promise<{ traces }> | List traces with spans for a given agent or session. |
| theazo.traces.get(traceId) | Promise<Trace> | Fetch a single trace with all nested spans, timing, and metadata. |
| GET /v1/agents/:id/logs/stream | SSE stream | Stream logs via Server-Sent Events. Event types: log, heartbeat. |
| theazo.usage.forUser(userId) | Promise<UserUsage> | Per-user cost breakdown: model, compute, total, and billing mode. |