Observability

Structured logs, metrics, traces, and alerts. Single pane of glass across all providers.

Logs

Every agent emits structured, level-tagged logs (debug, info, warn, error). Logs are streamed in real time via WebSocket and persisted to PostgreSQL for querying. Filter by session, agent, level, or free-text search.

logs.ts
// Stream logs in real time via WebSocket
const stream = theazo.logs.stream({
  sessionId: 'ses_...',
  level: 'error',        // only errors
})

for await (const entry of stream) {
  console.log(entry.timestamp)  // ISO 8601
  console.log(entry.level)      // 'error'
  console.log(entry.agentId)    // 'agt_...'
  console.log(entry.message)    // structured log message
}

SSE streaming

In addition to WebSocket, logs can be consumed via Server-Sent Events (SSE) at GET /v1/agents/:id/logs/stream. SSE is simpler for browser-based dashboards — no WebSocket library needed, just a native EventSource. The server emits two event types: log (with level, message, and timestamp) and heartbeat (keep-alive every 15s).

sse-logs.ts
// Stream logs via SSE (browser-native, no dependencies)
const source = new EventSource(
  'https://api.theazo.com/v1/agents/agt_.../logs/stream',
  // Note: auth header requires polyfill (e.g. eventsource-polyfill)
  // or pass token as query param: ?token=thz_...
)

source.addEventListener('log', (event) => {
  const entry = JSON.parse(event.data)
  console.log(entry.level)      // 'info' | 'warn' | 'error' | 'debug'
  console.log(entry.message)    // structured log message
  console.log(entry.timestamp)  // ISO 8601
  console.log(entry.agentId)    // 'agt_...'
})

source.addEventListener('heartbeat', () => {
  // Connection alive — no action needed
})

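// Clean up when done (for example, when the dashboard view closes).
// Calling close() right after subscribing would end the stream immediately.
source.close()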
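
The 15-second heartbeat described above can be used to detect a stalled stream. The following is a minimal sketch, not part of the Theazo SDK: HeartbeatWatchdog is a hypothetical helper, and the tolerance of two missed heartbeats is an assumption you would tune. Clock values are passed in explicitly so the behavior is easy to verify.

```typescript
// Hypothetical helper (not part of the Theazo SDK): tracks the time of the
// last SSE event and reports when the stream has been silent for too long.
class HeartbeatWatchdog {
  private lastSeen: number
  private readonly intervalMs: number
  private readonly missedAllowed: number

  constructor(intervalMs = 15000, missedAllowed = 2, now = Date.now()) {
    this.intervalMs = intervalMs       // heartbeat cadence stated above (15s)
    this.missedAllowed = missedAllowed // assumption: tolerate two missed beats
    this.lastSeen = now
  }

  // Call from both the 'log' and 'heartbeat' listeners.
  touch(now = Date.now()): void {
    this.lastSeen = now
  }

  // True once more than `missedAllowed` intervals have passed with no events.
  isStale(now = Date.now()): boolean {
    return now - this.lastSeen > this.intervalMs * this.missedAllowed
  }
}

// Deterministic walk-through with explicit clock values (milliseconds).
const watchdog = new HeartbeatWatchdog(15000, 2, 0)
console.log(watchdog.isStale(30000)) // false: exactly two intervals, not more
console.log(watchdog.isStale(30001)) // true: silence exceeded two intervals
watchdog.touch(40000)                // a heartbeat arrives
console.log(watchdog.isStale(60000)) // false: only 20s since the last event
```

In a browser you would call touch() inside the log and heartbeat listeners and check isStale() on a timer, closing and reopening the EventSource when it returns true. Native EventSource already reconnects after a dropped connection, so a watchdog like this mainly helps with connections that stay open but go silent.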

Querying historical logs

query-logs.ts
const { logs, total } = await theazo.logs.query({
  sessionId: 'ses_...',
  agentId: 'agt_...',
  level: 'warn',
  search: 'timeout',
  since: '2025-05-01T00:00:00Z',
  until: '2025-05-08T00:00:00Z',
  limit: 100,
  offset: 0,
})
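
Because the query takes limit and offset and returns total, results can be paged through in a loop. The sketch below assumes that total is the count of all entries matching the filter (not the page size); fetchAllLogs, PageQuery, and makeFakeQuery are hypothetical names introduced for illustration, and the in-memory query stands in for theazo.logs.query so the example runs on its own.

```typescript
// Shapes assumed from the query example above.
interface LogEntry {
  timestamp: string
  level: string
  agentId: string
  message: string
}
interface LogPage {
  logs: LogEntry[]
  total: number
}
type PageQuery = (page: { limit: number; offset: number }) => Promise<LogPage>

// Hypothetical helper: walks limit/offset pages until `total` is reached.
async function fetchAllLogs(
  query: PageQuery,
  limit = 100,
  maxEntries = 10000, // safety cap so a broad filter cannot run unbounded
): Promise<LogEntry[]> {
  const all: LogEntry[] = []
  let offset = 0
  while (all.length < maxEntries) {
    const { logs, total } = await query({ limit, offset })
    all.push(...logs)
    offset += logs.length
    // Stop on an empty page (defensive) or once every match has been fetched.
    if (logs.length === 0 || offset >= total) break
  }
  return all.slice(0, maxEntries)
}

// In-memory stand-in for theazo.logs.query so the sketch runs without the SDK.
function makeFakeQuery(count: number): PageQuery {
  const entries: LogEntry[] = []
  for (let i = 0; i < count; i++) {
    entries.push({
      timestamp: new Date(i * 1000).toISOString(),
      level: 'warn',
      agentId: 'agt_demo',
      message: `timeout #${i}`,
    })
  }
  return async ({ limit, offset }) => ({
    logs: entries.slice(offset, offset + limit),
    total: entries.length,
  })
}

fetchAllLogs(makeFakeQuery(250)).then((logs) => console.log(logs.length)) // 250
```

With the real client, the query argument would wrap the call shown above, passing your filters along with the limit and offset supplied by the helper.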

Metrics

Theazo collects platform-level metrics you can query for dashboards and alerting. Key metrics include agents.active, agents.boot_time, cost.total, and error_rate.

metrics.ts
const metrics = await theazo.metrics.query({
  metric: 'agents.active',
  interval: '1h',           // 1m, 5m, 1h, 1d
  since: '2025-05-01',
  until: '2025-05-08',
})

// Returns time series:
// [
//   { timestamp: '2025-05-01T00:00:00Z', value: 12 },
//   { timestamp: '2025-05-01T01:00:00Z', value: 15 },
//   ...
// ]
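
For dashboards it is often useful to reduce a time series to a few headline numbers. This is a minimal sketch that assumes each point has the { timestamp, value } shape shown above; summarize and the Summary type are hypothetical, and the third sample point is made up for the example.

```typescript
interface Point {
  timestamp: string // ISO 8601
  value: number
}
interface Summary {
  min: number
  max: number
  mean: number
  peakAt: string // timestamp of the first point that reaches the maximum
}

// Hypothetical helper: single pass over the series, null for an empty series.
function summarize(series: Point[]): Summary | null {
  if (series.length === 0) return null
  let min = Infinity
  let max = -Infinity
  let sum = 0
  let peakAt = ''
  for (const point of series) {
    sum += point.value
    if (point.value < min) min = point.value
    if (point.value > max) {
      max = point.value
      peakAt = point.timestamp
    }
  }
  return { min, max, mean: sum / series.length, peakAt }
}

const summary = summarize([
  { timestamp: '2025-05-01T00:00:00Z', value: 12 },
  { timestamp: '2025-05-01T01:00:00Z', value: 15 },
  { timestamp: '2025-05-01T02:00:00Z', value: 9 }, // sample data for illustration
])
console.log(summary)
// { min: 9, max: 15, mean: 12, peakAt: '2025-05-01T01:00:00Z' }
```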

Per-agent cost breakdown

Use theazo.usage.forUser(userId) to get a detailed cost breakdown for a user. The response breaks down spending into model (LLM inference cost), compute (sandbox compute time), and total (combined). All amounts are integer cents. The billingMode field indicates who pays for what — in BYOI modes, the corresponding cost field returns { amount: 0 } because the AgentCo pays the provider directly.

per-agent-cost.ts
const usage = await theazo.usage.forUser('user_123')

console.log(usage.model)        // { amount: 18, currency: 'usd' }  — $0.18
console.log(usage.compute)      // { amount: 24, currency: 'usd' }  — $0.24
console.log(usage.total)        // { amount: 42, currency: 'usd' }  — $0.42
console.log(usage.billingMode)  // 'managed' | 'byoi_compute' | 'byoi_models' | 'byoi_both'

// BYOI example: AgentCo brings their own compute (E2B/Fly credentials)
// → compute cost is 0 (they pay the provider directly)
const byoiUsage = await theazo.usage.forUser('user_byoi...')
console.log(byoiUsage.model)        // { amount: 18, currency: 'usd' }
console.log(byoiUsage.compute)      // { amount: 0, currency: 'usd' }
console.log(byoiUsage.total)        // { amount: 18, currency: 'usd' }
console.log(byoiUsage.billingMode)  // 'byoi_compute'
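
Since all amounts are integer cents, display and arithmetic are best kept in integers until the last step. The sketch below assumes the { amount, currency } shape shown above; formatMoney and addMoney are hypothetical helpers, not SDK functions.

```typescript
interface Money {
  amount: number   // integer cents, as described above
  currency: string // lowercase code, e.g. 'usd'
}

// Render integer cents as a decimal string without floating-point division.
function formatMoney({ amount, currency }: Money): string {
  const sign = amount < 0 ? '-' : ''
  const abs = Math.abs(amount)
  const whole = Math.floor(abs / 100)
  const cents = abs % 100
  const padded = cents < 10 ? `0${cents}` : `${cents}`
  return `${sign}${whole}.${padded} ${currency.toUpperCase()}`
}

// Add two amounts, refusing to mix currencies.
function addMoney(a: Money, b: Money): Money {
  if (a.currency !== b.currency) {
    throw new Error(`Currency mismatch: ${a.currency} vs ${b.currency}`)
  }
  return { amount: a.amount + b.amount, currency: a.currency }
}

const model: Money = { amount: 18, currency: 'usd' }
const compute: Money = { amount: 24, currency: 'usd' }
console.log(formatMoney(addMoney(model, compute))) // '0.42 USD'
```

The sum of model and compute matches the total field in the managed example above (18 + 24 = 42 cents).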

Traces

In orchestrator mode (Full Platform and Primitives-only), Theazo creates per-tool-call spans within each agent run. Every model call, tool invocation, and file operation is captured as a span with timing and metadata. In infra-only mode, traces are limited to compute-level events (boot, exec, destroy).

traces.ts
const { traces } = await theazo.traces.list({
  agentId: 'agt_...',
})

for (const trace of traces) {
  console.log(trace.traceId)     // 'trc_...'
  console.log(trace.spans)       // Span[]
  // Each span: { name, startTime, endTime, attributes, status }
}

Trace waterfall visualization

Traces render as a waterfall timeline in the dashboard, showing every operation in an agent run as a horizontal bar on a shared time axis. Each span represents one of four operation types: model call, tool execution, file I/O, or sandbox boot. Spans carry a name, duration, start/end timestamps, status (ok or error), and arbitrary metadata. Nested spans show parent-child relationships — for example, a model call that triggers a tool execution which performs file I/O.

Use theazo.traces.get(traceId) to fetch a full trace with nested spans:

trace-waterfall.ts
const trace = await theazo.traces.get('trc_...')

console.log(trace.traceId)    // 'trc_...'
console.log(trace.agentId)    // 'agt_...'
console.log(trace.duration)   // total duration in ms
console.log(trace.status)     // 'ok' | 'error'

// Spans are nested — each span can have children
for (const span of trace.spans) {
  console.log(span.name)       // 'model.call' | 'tool.exec' | 'file.read' | 'sandbox.boot'
  console.log(span.startTime)  // ISO 8601
  console.log(span.endTime)    // ISO 8601
  console.log(span.duration)   // ms
  console.log(span.status)     // 'ok' | 'error'
  console.log(span.metadata)   // { model: 'claude-sonnet-4-20250514', tokens: 1847, ... }
  console.log(span.children)   // Span[] — nested tool calls, file ops, etc.
}

The dashboard renders trace waterfalls as a visual timeline. Click any agent run to see the full span breakdown with timing, status, and metadata for each operation.
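
If you want to render your own waterfall, the nested spans can be flattened into rows with a depth and an offset from the start of the trace. This is a sketch under assumptions: the Span shape is reduced to the fields documented above, children is treated as optional on leaf spans, the time origin is taken to be the earliest root span, and toWaterfallRows is a hypothetical helper. The sample timestamps are invented.

```typescript
interface Span {
  name: string
  startTime: string // ISO 8601
  endTime: string   // ISO 8601
  status: 'ok' | 'error'
  children?: Span[]
}
interface Row {
  depth: number      // nesting level, 0 for root spans
  name: string
  offsetMs: number   // start relative to the earliest root span
  durationMs: number
  status: 'ok' | 'error'
}

// Depth-first walk so each parent is immediately followed by its children.
function toWaterfallRows(spans: Span[]): Row[] {
  if (spans.length === 0) return []
  const origin = Math.min(...spans.map((s) => Date.parse(s.startTime)))
  const rows: Row[] = []
  const visit = (span: Span, depth: number): void => {
    const start = Date.parse(span.startTime)
    rows.push({
      depth,
      name: span.name,
      offsetMs: start - origin,
      durationMs: Date.parse(span.endTime) - start,
      status: span.status,
    })
    for (const child of span.children || []) visit(child, depth + 1)
  }
  for (const span of spans) visit(span, 0)
  return rows
}

// Sample data: a model call that triggers a tool execution that reads a file.
const sample: Span[] = [{
  name: 'model.call',
  startTime: '2025-05-01T00:00:00.000Z',
  endTime: '2025-05-01T00:00:00.900Z',
  status: 'ok',
  children: [{
    name: 'tool.exec',
    startTime: '2025-05-01T00:00:00.100Z',
    endTime: '2025-05-01T00:00:00.600Z',
    status: 'ok',
    children: [{
      name: 'file.read',
      startTime: '2025-05-01T00:00:00.200Z',
      endTime: '2025-05-01T00:00:00.300Z',
      status: 'ok',
    }],
  }],
}]

for (const row of toWaterfallRows(sample)) {
  console.log(`${'  '.repeat(row.depth)}${row.name} +${row.offsetMs}ms (${row.durationMs}ms)`)
}
// model.call +0ms (900ms)
//   tool.exec +100ms (500ms)
//     file.read +200ms (100ms)
```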

Alerts

Theazo fires alerts for conditions that need attention. Alerts appear in the dashboard and can trigger webhooks.

  • Failed agents: an agent threw an unrecoverable error or exceeded max retries.
  • Cost limit warnings: a session has used 80% or more of its maxCost budget.
  • Long-running agents: an agent has been running for more than 30 minutes without completing.

Subscribe to alerts via webhook. Theazo sends a POST request to your endpoint (for example, https://yourapp.com/webhooks/theazo) with a JSON body like this:

alert-webhook.json
{
  "event": "agent.failed",
  "data": {
    "agentId": "agt_...",
    "sessionId": "ses_...",
    "error": "Max retries exceeded",
    "duration": 847,
    "cost": { "amount": 42, "currency": "usd" }
  }
}
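
A receiving endpoint should validate the payload before acting on it. The sketch below checks only the agent.failed shape shown above, since that is the only event documented here; parseAgentFailed and isRecord are hypothetical helpers, and the sketch does not cover request authentication or any other event types.

```typescript
// Shape taken from the sample payload above.
interface AgentFailedEvent {
  event: 'agent.failed'
  data: {
    agentId: string
    sessionId: string
    error: string
    duration: number
    cost: { amount: number; currency: string }
  }
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null
}

// Returns a typed event, or null if the body is not a well-formed agent.failed payload.
function parseAgentFailed(body: string): AgentFailedEvent | null {
  let parsed: unknown
  try {
    parsed = JSON.parse(body)
  } catch {
    return null // not JSON at all
  }
  if (!isRecord(parsed) || parsed['event'] !== 'agent.failed') return null
  const data = parsed['data']
  if (!isRecord(data)) return null
  const { agentId, sessionId, error, duration, cost } = data
  if (typeof agentId !== 'string' || typeof sessionId !== 'string') return null
  if (typeof error !== 'string' || typeof duration !== 'number') return null
  if (!isRecord(cost)) return null
  const amount = cost['amount']
  const currency = cost['currency']
  if (typeof amount !== 'number' || typeof currency !== 'string') return null
  return {
    event: 'agent.failed',
    data: { agentId, sessionId, error, duration, cost: { amount, currency } },
  }
}

const body = JSON.stringify({
  event: 'agent.failed',
  data: {
    agentId: 'agt_...',
    sessionId: 'ses_...',
    error: 'Max retries exceeded',
    duration: 847,
    cost: { amount: 42, currency: 'usd' },
  },
})
const alert = parseAgentFailed(body)
if (alert) console.log(`${alert.data.agentId} failed: ${alert.data.error}`)
```

Returning null instead of throwing lets the handler respond with a client error for malformed bodies without wrapping every call in try/catch.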

Observability by mode

The depth of observability data depends on which mode you operate in. Full Platform and Primitives-only modes provide the richest data because Theazo controls the agent loop. Infra-only mode provides compute-level metrics only.

| Capability           | Full Platform | BYOI Primitives | Infra-only    |
|----------------------|---------------|-----------------|---------------|
| Structured logs      | Full          | Full            | Compute only  |
| Tool-call spans      | Yes           | Yes             | No            |
| Model token tracking | Yes           | Yes             | No            |
| Compute metrics      | Yes           | Yes             | Yes           |
| Alerts               | Full          | Full            | Failures only |

Observability works across all providers: one dashboard for E2B, Fly, and Docker. Logs, metrics, and traces are normalized into a single schema regardless of the underlying compute backend.

Method reference

| Method | Returns | Description |
|--------|---------|-------------|
| theazo.logs.stream(filters) | AsyncIterable<LogEntry> | Stream logs in real time via WebSocket. Filter by session, agent, level. |
| theazo.logs.query(filters) | Promise<{ logs, total }> | Query historical logs with filters, pagination, and search. |
| theazo.metrics.query(opts) | Promise<TimeSeries[]> | Query metric time series by name, interval, and time range. |
| theazo.traces.list(filters) | Promise<{ traces }> | List traces with spans for a given agent or session. |
| theazo.traces.get(traceId) | Promise<Trace> | Fetch a single trace with all nested spans, timing, and metadata. |
| GET /v1/agents/:id/logs/stream | SSE stream | Stream logs via Server-Sent Events. Event types: log, heartbeat. |
| theazo.usage.forUser(userId) | Promise<UserUsage> | Per-user cost breakdown: model, compute, total, and billing mode. |