Using Stream with an AI agent¶
Stream is built to be driven by an AI coding agent (such as Claude Code) as well as by a human. There are two layers of programmatic support: an MCP server that lets an agent run and inspect optimizations as a tool, and typed IR models for structured results.
The MCP server¶
Stream ships an MCP (Model Context Protocol) server so an agent can submit and inspect TETRA constraint-optimization jobs as tool calls. It needs the [mcp] extra:
pip install -e ".[mcp]"
Launch it (STDIO / JSON-RPC transport) from the repo root:
python3 -c "from stream.mcp.server import mcp; mcp.run(transport='stdio')"
The server (stream/mcp/server.py, name stream) exposes six tools:
| Tool | Purpose |
|---|---|
run_optimization(hardware, workload, mapping, output_path, backend, ...) | Submit a CO job; returns a job_id immediately and solves in the background. |
poll_optimization(job_id) | Check status: pending / running / complete / failed / not_found. |
get_workload_ir(workload=None, experiment_id=None) | The workload graph as WorkloadIR JSON. |
get_accelerator_ir(hardware=None, experiment_id=None) | The hardware model as AcceleratorIR JSON. |
get_allocation_ir(job_id) | The allocation result as AllocationIR JSON (three persona views). |
get_solve_stats(job_id) | MILP solve statistics (objective, time, gap, node count, backend). |
Typical flow: run_optimization(...) → poll poll_optimization(job_id) until complete → inspect with get_allocation_ir(job_id) and get_solve_stats(job_id).
Typed IR for structured results¶
When an agent (or any tool) needs machine-readable output instead of console text, use the IR models. They wrap the internal objects and serialize cleanly to JSON:
from stream.ir import WorkloadIR, AcceleratorIR, AllocationIR
# after running optimize_allocation_co_generic(...) -> ctx
workload_ir = WorkloadIR.from_internal(ctx.get("workload"))
accelerator_ir = AcceleratorIR.from_internal(ctx.get("accelerator"))
allocation_ir = AllocationIR.from_internal(ctx.get("scheduler"))
allocation_data = allocation_ir.model_dump() # JSON-compatible dict
AllocationIR offers persona-specific views of the same result - .algorithmic_view(), .hardware_view(), .compiler_view() - so a consumer asks for the shape it needs (the algorithmic graph, the hardware placement, or the compiler-facing detail).
In short¶
- Running optimizations as a tool? Use the MCP server.
- Consuming results programmatically? Use the IR models.