Playground

A multi-tab developer console built into the Theazo dashboard. Test every primitive — Chat, Agents, Workflows, Fleets, Schedules, and Tools — interactively, without writing backend code. Results stream in real time over SSE.

The Playground always runs against a development session (environment: 'development'). Dev sessions are tracked but not billed for compute. Model token costs still apply if you use managed models.

Overview

The Playground is split into two side-by-side panels. The left panel is the config panel: shared fields (definition, model, system prompt, tools) plus tab-specific options. The right panel is the output panel: streaming results, logs, cost breakdown, and artifacts.

Six tabs run across the top. Each tab is a different execution mode. Switching tabs does not clear your config; shared fields persist. Tab-specific fields reset to defaults when you switch.

Chat: Multi-turn conversation testing with configurable context strategies and conversation history.
Agent: One-shot task execution. Set a cost cap and timeout, inspect artifacts, see per-step tool calls.
Workflow: DAG visualization with step cards. Supports planner-dynamic steps and inline approval UI.
Fleet: Batch input builder. Configure concurrency, watch a live progress bar, inspect per-item results.
Schedule: Cron expression editor with human-readable preview. Create webhook triggers with auto-generated cURL.
Tools: Browse built-in and MCP tools, fill a JSON input form, execute directly, inspect structured output.

Session lifecycle

When you open the Playground, a development session is auto-created for your platform. The session ID is displayed in the top-right corner of the config panel. It is pinned for the duration of your browser session.

Changing the definition or model triggers a new session. The old session is destroyed (sandbox torn down, resources released). This ensures isolation between different configurations and keeps dev costs near zero.

session-lifecycle.ts
// What happens behind the scenes when you click Run in the Playground

// 1. Browser calls Next.js API route (API key stays server-side)
POST /api/playground/run

// 2. Next.js route calls Hono API with your platform key
POST https://api.theazo.com/v1/sessions
{ "userId": "playground_dev", "environment": "development" }

// 3. Agent/workflow/fleet runs inside that session
// 4. Results stream back via SSE → browser renders tokens as they arrive
// 5. Session is destroyed automatically after 30 min of inactivity

Real API proxy

The Playground never exposes your API key to the browser. All requests flow through a Next.js API route that injects the key server-side:

apps/web/app/api/playground/run/route.ts
// Next.js API route — API key never reaches the browser
export async function POST(req: Request) {
  const body = await req.json()

  const upstream = await fetch('https://api.theazo.com/v1/...', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.THEAZO_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  })

  // Stream SSE events directly back to browser
  return new Response(upstream.body, {
    headers: { 'Content-Type': 'text/event-stream' },
  })
}
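
The browser side of this proxy is not shown here. As a rough sketch, assuming the stream follows the standard SSE wire format (event: and data: fields, events separated by a blank line), a client could split buffered text into events as below. The helper name parseSseBuffer and the SseEvent shape are illustrative and not part of Theazo.

```typescript
// Hypothetical helper, not part of the Theazo SDK.
interface SseEvent {
  event: string // event name; SSE uses "message" when no event: field is sent
  data: string  // data lines joined with "\n"
}

// Splits buffered SSE text into complete events and returns the unparsed remainder.
function parseSseBuffer(buffer: string): { events: SseEvent[]; rest: string } {
  const normalized = buffer.replace(/\r\n/g, '\n')
  const blocks = normalized.split('\n\n')
  // The last block may be incomplete; keep it for the next chunk.
  const rest = blocks.pop() ?? ''
  const events: SseEvent[] = []

  for (const block of blocks) {
    let event = 'message'
    const dataLines: string[] = []
    for (const line of block.split('\n')) {
      if (line.startsWith(':')) continue // comment or keep-alive line
      const colon = line.indexOf(':')
      const field = colon === -1 ? line : line.slice(0, colon)
      let value = colon === -1 ? '' : line.slice(colon + 1)
      if (value.startsWith(' ')) value = value.slice(1) // strip one leading space
      if (field === 'event') event = value
      else if (field === 'data') dataLines.push(value)
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join('\n') })
  }
  return { events, rest }
}
```

In a browser, the chunks would typically come from response.body.getReader() decoded with a TextDecoder, with rest carried over and prepended to the next chunk. The built-in EventSource API only issues GET requests, so it cannot be pointed at a POST route like this one.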

SSE streaming

Every tab streams results via Server-Sent Events. The output panel updates incrementally — no polling, no page reloads. What streams depends on the tab:

Chat (event: token): Each model token appears as it is generated. Tool calls render as collapsible blocks inline.
Agent (event: step): Each tool call emits a step event: tool name, input, output, duration, cost. Final answer renders as markdown.
Workflow (events: step_started / step_completed): Step cards in the DAG visualization light up as steps start and complete. Approval steps show an inline approve/deny button.
Fleet (event: item_progress): Progress bar advances per item. Each row in the results table fills in with status, cost, and output as items complete.
Schedule (event: fire_result): Fire Now sends one immediate invocation and streams its output exactly like the Agent tab.
Tools (event: tool_result): Structured JSON output renders in a syntax-highlighted block. Errors show the raw exception.
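
One way to handle these events in client code is a discriminated union keyed on the event name. The sketch below takes the event names from the list above; every payload field (text, stepId, index, and so on) is an assumption made for illustration, since the payload shapes are not documented here.

```typescript
// Event names come from the docs; all payload fields are assumptions.
type PlaygroundEvent =
  | { type: 'token'; text: string }
  | { type: 'step'; tool: string; input: unknown; output: unknown; durationMs: number; costCents: number }
  | { type: 'step_started'; stepId: string }
  | { type: 'step_completed'; stepId: string }
  | { type: 'item_progress'; index: number; status: string }
  | { type: 'fire_result'; output: string }
  | { type: 'tool_result'; result: unknown }

// Returns a one-line summary per event; the switch is exhaustive over the union.
function describeEvent(e: PlaygroundEvent): string {
  switch (e.type) {
    case 'token':
      return e.text
    case 'step':
      return `${e.tool} took ${e.durationMs} ms and cost ${e.costCents} cents`
    case 'step_started':
      return `step ${e.stepId} started`
    case 'step_completed':
      return `step ${e.stepId} completed`
    case 'item_progress':
      return `item ${e.index}: ${e.status}`
    case 'fire_result':
      return e.output
    case 'tool_result':
      return JSON.stringify(e.result)
  }
}
```

Because the switch covers every member of the union, the TypeScript compiler flags any event type added later that lacks a handler.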

Tab reference

Chat tab

The Chat tab is a full multi-turn conversation interface. The output panel shows the conversation thread with user messages on the right and assistant messages on the left. Tool calls appear as collapsible blocks between turns.

Tab-specific config fields:

Context strategy: One of full_history (keeps every message), sliding_window (trims the oldest turns when the context fills), or summary (compresses old turns via a summarizer agent).
Max turns: Stop the conversation after N turns. Useful for automated test loops.
User persona: Optional mock user identity (name, locale, timezone) injected into the system prompt for persona testing.
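
To make the sliding_window strategy concrete, here is a minimal sketch of how trimming might work. It assumes each message carries a precomputed token count and that the newest message is always kept; the actual trimming rules are not specified here, and the names ChatMessage and applySlidingWindow are hypothetical.

```typescript
// Hypothetical message shape; the tokens field is an assumed precomputed count.
interface ChatMessage {
  role: 'user' | 'assistant'
  content: string
  tokens: number
}

// Drops the oldest messages until the remainder fits the token budget.
// Assumption: the newest message is never dropped, even if it exceeds the budget alone.
function applySlidingWindow(history: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept = [...history]
  let total = kept.reduce((sum, m) => sum + m.tokens, 0)
  while (total > maxTokens && kept.length > 1) {
    const dropped = kept.shift()
    if (dropped) total -= dropped.tokens
  }
  return kept
}
```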

Press Enter to send a message. Cmd+K clears the conversation history without creating a new session.

Agent tab

The Agent tab runs a single task end-to-end. The output panel has four sub-panels: Output (markdown), Logs (real-time tool call log), Artifacts (downloadable files created by the agent), and Cost (compute + model + storage breakdown).

Tab-specific config fields:

Task: The instruction passed to agent.run(). Supports {{ variable }} template syntax if the definition has input schema fields.
Cost cap: Maximum spend in cents for this run. Agent is terminated if the limit is hit. Maps to costCap in the run options.
Timeout: Wall-clock timeout in seconds before the run is cancelled. Default 120s for dev sessions.
Save artifacts: Toggle whether files written to the sandbox are persisted to R2 after the run ends.
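
The effect of a cost cap can be illustrated with a small simulation. This is only a sketch: it assumes the cap is checked after each step and that reaching the cap exactly counts as hitting it. How and when Theazo actually evaluates the cap is not described here, and StepCost and applyCostCap are hypothetical names.

```typescript
// Hypothetical per-step cost record.
interface StepCost {
  tool: string
  costCents: number
}

// Walks step costs in order and reports where a cap would stop the run.
// Assumption: the cap is evaluated after each step finishes.
function applyCostCap(
  steps: StepCost[],
  capCents: number,
): { executed: number; spentCents: number; terminated: boolean } {
  let spentCents = 0
  let executed = 0
  for (const step of steps) {
    spentCents += step.costCents
    executed += 1
    if (spentCents >= capCents) {
      return { executed, spentCents, terminated: true }
    }
  }
  return { executed, spentCents, terminated: false }
}
```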

Press Cmd+Enter to run. The Run button shows a live elapsed timer and disables itself until the agent completes or is cancelled.

Workflow tab

The Workflow tab renders the DAG visually. Steps are displayed as cards connected by directed edges. Parallel groups are shown side-by-side. As a run progresses, cards highlight: grey (pending), amber (running), green (completed), red (failed).
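
The card highlighting can be modeled as a small state update driven by the step_started and step_completed events described under SSE streaming. The sketch below is illustrative: the stepId field is an assumption, and because no failure event is named in this page, the red (failed) state appears only in the color map.

```typescript
type StepStatus = 'pending' | 'running' | 'completed' | 'failed'

// Colors as listed in the Workflow tab description.
const CARD_COLOR: Record<StepStatus, string> = {
  pending: 'grey',
  running: 'amber',
  completed: 'green',
  failed: 'red',
}

// Assumed event shape; only the event names are taken from the docs.
type StepEvent = { type: 'step_started' | 'step_completed'; stepId: string }

// Returns a new card map with the affected step moved to its next status.
function applyStepEvent(
  cards: Record<string, StepStatus>,
  e: StepEvent,
): Record<string, StepStatus> {
  const next: StepStatus = e.type === 'step_started' ? 'running' : 'completed'
  return { ...cards, [e.stepId]: next }
}
```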

Click any step card to expand it and see its output, cost, and duration inline. Approval steps show an Approve / Deny button directly in the card — no need to navigate to the Approvals page during testing.

Workflows with type: 'planner' steps show dynamically-generated sub-steps as they are created by the planner agent. The DAG expands in real time as new steps are emitted.

Tab-specific config fields: workflow ID (select from your defined workflows) and input JSON (the input object passed to workflow.run()).

Fleet tab

The Fleet tab dispatches the same task across many inputs simultaneously. The left panel has an input builder: paste a JSON array or upload a CSV, and each row becomes one fleet item. A concurrency slider controls how many agents run in parallel (1–50 for dev sessions).

The output panel shows a progress bar (X / N completed) and a results table. Each row is one item: status dot, item index, cost, duration, and a collapsed output preview. Click any row to expand the full output for that item.

Tab-specific config fields:

Input array: JSON array of objects. Each object is passed as the input to one agent run.
Concurrency: Maximum simultaneous agents. Higher values finish faster but use more compute quota.
Stop on error: If any item fails, cancel remaining items. Defaults to false (continue on error).
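
For the CSV upload path, each row becomes one fleet item. The following sketch shows one plausible conversion into the JSON array form. It assumes the first row holds column names and handles only simple CSV (no quoted fields or embedded commas); the Playground's actual parsing rules are not documented here, and csvToFleetItems is a hypothetical helper.

```typescript
// Simplified CSV parsing: no quoted fields, no embedded commas.
// Assumption: the first non-empty line is a header row.
function csvToFleetItems(csv: string): Record<string, string>[] {
  const lines = csv
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
  const [headerLine, ...rows] = lines
  if (headerLine === undefined) return []
  const headers = headerLine.split(',').map((h) => h.trim())
  return rows.map((row) => {
    const cells = row.split(',')
    const item: Record<string, string> = {}
    headers.forEach((header, i) => {
      item[header] = (cells[i] ?? '').trim() // missing cells become empty strings
    })
    return item
  })
}
```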

Schedule tab

The Schedule tab has two modes: Cron and Webhook.

In Cron mode, a cron expression input shows a human-readable preview below it ("Every Monday at 9:00 AM UTC"). A timezone selector (IANA format) adjusts the display. A Fire Now button sends one immediate invocation and streams results in the output panel exactly like the Agent tab. This lets you test the scheduled task before enabling the cron.

In Webhook mode, the tab generates a unique webhook endpoint URL and shows the matching cURL command. Sending an HTTP request to that URL triggers an agent run. The Playground listens for the inbound event and streams the result in real time, so you can test your webhook integration from a terminal.

webhook-trigger.sh
# Auto-generated cURL shown in the Playground Schedule tab
curl -X POST https://api.theazo.com/v1/triggers/whk_abc123/fire \
  -H "Content-Type: application/json" \
  -d '{ "event": "order.created", "orderId": "ord_9x2" }'

# Agent runs immediately; Playground streams the result
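
The same request can be issued from TypeScript. The sketch below only builds the request so it can be inspected before sending. The URL pattern is copied from the cURL example above, while buildWebhookRequest is a hypothetical helper and not part of the Theazo SDK.

```typescript
interface WebhookRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Builds the request for a webhook trigger; the URL pattern mirrors the cURL example.
function buildWebhookRequest(triggerId: string, payload: unknown): WebhookRequest {
  return {
    url: `https://api.theazo.com/v1/triggers/${encodeURIComponent(triggerId)}/fire`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    },
  }
}

// Usage (sends the request):
//   const { url, init } = buildWebhookRequest('whk_abc123', { event: 'order.created', orderId: 'ord_9x2' })
//   await fetch(url, init)
```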

Tools tab

The Tools tab lets you execute any individual tool in isolation — without running a full agent. The left panel is a tool browser: built-in tools (web_search, read_file, write_file, bash, browser) are listed at the top. MCP servers registered to your platform appear below, with their tool list expanded.

Select a tool to load its JSON schema into the input form. Fill the form fields and click Execute. The result appears in the output panel as syntax-highlighted JSON. Errors show the raw exception message and stack so you can debug MCP server issues directly.

The tool browser also shows which tools are currently attached to the selected agent definition, with a toggle to add or remove them without leaving the Playground.

Config panel

These fields appear in the config panel across all tabs:

Definition: Agent definition from your Agent Store. Loads the definition's model, prompt, and tools as defaults.
Model: Override the definition's model. Shows cost-per-1M-tokens beside each option.
System prompt: Override the definition's system prompt. Supports {{ variable }} template substitution.
Tools: Multi-select from all available tools (built-in + MCP). Overrides the definition's tool list.
Knowledge: Attach a knowledge collection for RAG. The agent queries it automatically on each run.
Guardrails: Content filter level (off / moderate / strict), PII blocking, and domain allow-list.
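
The {{ variable }} substitution used by the system prompt and task fields can be approximated as follows. This is a sketch under assumptions: variable names are treated as simple identifiers, whitespace inside the braces is optional, and placeholders with no matching value are left untouched. Theazo's actual template rules may differ, and renderTemplate is a hypothetical name.

```typescript
// Replaces {{ name }} placeholders with values from vars.
// Assumption: unknown placeholders are left as-is rather than removed.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(
    /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (match: string, name: string) => {
      const value = Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : undefined
      return value !== undefined ? value : match
    },
  )
}
```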

Save as Definition

Once you've tuned a configuration in the Playground, click Save as Definition in the top-right of the config panel. This writes the current model, prompt, tools, guardrails, and knowledge settings to the Agent Store as a new definition (or updates an existing one if you loaded one at the start).

Saved definitions are immediately available in production via the SDK:

use-saved-definition.ts
import { Theazo } from 'theazo'

const theazo = new Theazo({ apiKey: process.env.THEAZO_API_KEY! })
const session = await theazo.sessions.forUser('user_123')

// Use the definition you saved from the Playground
const agent = await session.agents.create({ definition: 'lead-researcher' })
const result = await agent.run('Analyze Stripe as a competitor')

Every entity in the dashboard has a Test button that opens the Playground with that entity's config pre-loaded:

Agent Store → Agent tab: Loads the definition's model, prompt, and tools. You can run a task immediately.
Workflows list → Workflow tab: Loads the workflow ID and pre-fills the input JSON with the workflow's input schema defaults.
Schedules list → Schedule tab: Loads the cron expression and agent definition. Fire Now runs one immediate invocation.
Tools list → Tools tab: Selects the tool in the browser and loads its input schema into the form.

Deep-links use URL query params: /playground?tab=agent&definition=lead-researcher. You can bookmark or share these links with teammates.
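
A deep link of this form can be assembled with the standard URLSearchParams API. In the sketch below, only tab=agent and the definition parameter are confirmed by the example above; the other tab values and the helper name buildPlaygroundLink are assumptions.

```typescript
// Only 'agent' appears in the documented example; the other values are assumed
// to follow the same lowercase naming.
type PlaygroundTab = 'chat' | 'agent' | 'workflow' | 'fleet' | 'schedule' | 'tools'

// Builds a relative Playground URL with the tab first, then any extra params.
function buildPlaygroundLink(tab: PlaygroundTab, params: Record<string, string> = {}): string {
  const query = new URLSearchParams({ tab, ...params })
  return `/playground?${query.toString()}`
}
```

URLSearchParams percent-encodes values, so definition names containing spaces or other reserved characters still produce a valid link.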

Run history

The Playground persists your last 100 runs in localStorage. Click the History button in the top-right to open the history drawer. Each entry shows:

  • Tab type (Chat / Agent / Workflow / Fleet / Schedule / Tools)
  • Definition name or tool name used
  • Run timestamp and duration
  • Total cost in cents (model + compute)
  • Final status (completed / failed / cancelled)

Click any history entry to restore the full config and output into the current tab. This makes it easy to re-run a previous configuration with small tweaks.
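
A capped history like this can be kept with a few lines around localStorage. The sketch below assumes a newest-first list under a single storage key; the key name, the RunRecord fields, and both function names are hypothetical, since the Playground's internal storage format is not documented here. The store is passed in so the same code works with window.localStorage or an in-memory stand-in.

```typescript
// Hypothetical record shape based on the fields listed above.
interface RunRecord {
  tab: string
  name: string
  startedAt: string
  durationMs: number
  costCents: number
  status: 'completed' | 'failed' | 'cancelled'
}

// Minimal subset of the Storage interface used by this sketch.
interface KeyValueStore {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
}

const HISTORY_KEY = 'playground:history' // assumed key name
const HISTORY_LIMIT = 100

// Reads the stored history, falling back to an empty list on missing or corrupt data.
function loadHistory(store: KeyValueStore): RunRecord[] {
  const raw = store.getItem(HISTORY_KEY)
  if (raw === null) return []
  try {
    const parsed: unknown = JSON.parse(raw)
    return Array.isArray(parsed) ? (parsed as RunRecord[]) : []
  } catch {
    return []
  }
}

// Prepends a run and trims the list to the most recent HISTORY_LIMIT entries.
function recordRun(store: KeyValueStore, run: RunRecord): RunRecord[] {
  const next = [run, ...loadHistory(store)].slice(0, HISTORY_LIMIT)
  store.setItem(HISTORY_KEY, JSON.stringify(next))
  return next
}
```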

Keyboard shortcuts

Cmd+1 to Cmd+6: Switch to tab 1 through 6 (Chat, Agent, Workflow, Fleet, Schedule, Tools)
Enter: Send message (Chat tab only)
Cmd+Enter: Run the current task / workflow / fleet / schedule / tool
Cmd+K: Clear output panel and conversation history (does not destroy the session)
Escape: Cancel an in-progress run
Cmd+S: Save current config as Definition (opens save dialog)
Cmd+H: Toggle history drawer
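
For readers building similar tooling, the shortcut table can be expressed as a pure lookup from a key press to an action. This is an illustrative sketch, not the Playground's implementation: the action names and the resolveShortcut helper are hypothetical, and the key and metaKey fields mirror the standard KeyboardEvent properties.

```typescript
type ShortcutAction =
  | { kind: 'switch_tab'; tab: number }
  | { kind: 'run' }
  | { kind: 'send' }
  | { kind: 'clear' }
  | { kind: 'cancel' }
  | { kind: 'save' }
  | { kind: 'toggle_history' }

// Subset of KeyboardEvent used by the lookup.
interface KeyPress {
  key: string
  metaKey: boolean
}

// Maps a key press to an action, or null when no shortcut applies.
function resolveShortcut(press: KeyPress, activeTab: string): ShortcutAction | null {
  const key = press.key.toLowerCase()
  if (press.metaKey) {
    if (/^[1-6]$/.test(key)) return { kind: 'switch_tab', tab: Number(key) }
    if (key === 'enter') return { kind: 'run' }
    if (key === 'k') return { kind: 'clear' }
    if (key === 's') return { kind: 'save' }
    if (key === 'h') return { kind: 'toggle_history' }
    return null
  }
  if (key === 'escape') return { kind: 'cancel' }
  // Plain Enter sends a message only on the Chat tab.
  if (key === 'enter' && activeTab === 'chat') return { kind: 'send' }
  return null
}
```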

Cost control

Dev sessions use environment: 'development'. Compute time in dev sessions is not billed. Model token costs still apply when using managed models (Claude, GPT-4o, Gemini). To keep Theazo charges at or near zero during testing:

  • Use a BYOI model key (your own Anthropic/OpenAI key) so model costs hit your provider account, not Theazo billing
  • Set a cost cap per run in the Agent tab to prevent runaway tasks
  • Keep Fleet concurrency at its dev-session default of 3 to avoid accidental large batch runs

The cost breakdown in the output panel updates in real time as the run progresses. It shows compute, model tokens (input + output separately), and storage. Costs are displayed in cents in a monospaced font (Geist Mono) so that digits line up.
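
As an illustration of how such a breakdown adds up, the sketch below sums the four components and formats the total. The field names and the two-decimal formatting are assumptions; the Playground's actual data shape and rounding are not documented here.

```typescript
// Hypothetical breakdown shape mirroring the components named above.
interface CostBreakdown {
  computeCents: number
  modelInputCents: number
  modelOutputCents: number
  storageCents: number
}

// Sums all components of a breakdown.
function totalCents(c: CostBreakdown): number {
  return c.computeCents + c.modelInputCents + c.modelOutputCents + c.storageCents
}

// Formats a cent amount with two decimal places (assumed display precision).
function formatCents(cents: number): string {
  return `${cents.toFixed(2)} cents`
}
```

In a dev session the compute component would be zero, so the total reflects model tokens and storage only.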

Fleet runs with high concurrency and large input arrays can still accumulate significant model token costs in dev sessions. Always set a reasonable cost cap when testing fleets with more than 20 items.