Outputs
Every entry point returns a StageContext and writes a set of files to the run's output directory (<output_path>/<experiment_id>/). This page covers both.
The result context
After a run, read results off the returned context with ctx.get(...):
| Key | What it is |
|---|---|
| total_latency | Total scheduled latency (cycles) for the workload. |
| group_latencies | Per-fusion-group latency breakdown. |
| scheduler | The SteadyStateScheduler - the full schedule and timing. |
| workload | The parsed computation graph. |
| accelerator | The parsed hardware model. |
ctx = optimize_allocation_co_generic(...)
print(ctx.get("total_latency")) # e.g. 14344.0
print(ctx.get("group_latencies"))
scheduler = ctx.get("scheduler")
Files written to disk
- summary.yaml - a machine-readable summary of the run (e.g. total_latency, per-group latencies).
- Visualizations (PNG) - workload graph, tiling, and the schedule, written into the run directory.
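To check what a run left behind, a small helper can collect the files from the run directory. This is a minimal sketch using only the standard library: `list_run_outputs` is a hypothetical name, not part of Stream, and it assumes only the layout described above (a summary.yaml plus PNG files under <output_path>/<experiment_id>/). It searches recursively in case the images sit in a subfolder.

```python
from pathlib import Path


def list_run_outputs(output_path: str, experiment_id: str) -> dict:
    """Collect the files found under <output_path>/<experiment_id>/.

    Hypothetical helper for illustration; not part of Stream.
    """
    run_dir = Path(output_path) / experiment_id
    summary = run_dir / "summary.yaml"
    return {
        "run_dir": run_dir,
        # None when no summary file is present in the run directory.
        "summary": summary if summary.is_file() else None,
        # All PNG files below the run directory, sorted for stable output.
        "images": sorted(run_dir.rglob("*.png")),
    }
```

Calling `list_run_outputs("outputs", "my_experiment")` (both values are placeholders) returns the paths without parsing anything, so it works regardless of the exact contents of summary.yaml.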
Schedule trace (Perfetto)
The schedule can be exported as a Perfetto JSON trace and opened at https://ui.perfetto.dev to inspect each core's timeline and the inter-core transfers. See stream/visualization/ for the trace and plotting helpers.
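Besides viewing the trace in the browser, it can be summarized in a script. The sketch below assumes the exported file follows the Chrome Trace Event JSON format that the Perfetto UI accepts (a top-level "traceEvents" list whose complete events carry "ph": "X", "ts", "dur", "pid", and "tid"). That layout is an assumption about the export, not something this page confirms. The helper names and the example trace are hypothetical, written by hand for illustration.

```python
import json
from collections import defaultdict


def load_trace(path: str) -> dict:
    """Read a JSON trace file from disk."""
    with open(path) as f:
        return json.load(f)


def busy_time_per_track(trace: dict) -> dict:
    """Sum the duration of complete ("X") events per (pid, tid) track."""
    busy = defaultdict(float)
    for event in trace.get("traceEvents", []):
        if event.get("ph") == "X":  # complete event: carries both ts and dur
            busy[(event.get("pid"), event.get("tid"))] += event.get("dur", 0)
    return dict(busy)


# Hand-written example trace, not real Stream output.
example = {
    "traceEvents": [
        {"name": "conv1", "ph": "X", "ts": 0, "dur": 100, "pid": 0, "tid": 0},
        {"name": "conv2", "ph": "X", "ts": 100, "dur": 50, "pid": 0, "tid": 0},
        {"name": "transfer", "ph": "X", "ts": 100, "dur": 30, "pid": 0, "tid": 1},
    ]
}
print(busy_time_per_track(example))  # {(0, 0): 150.0, (0, 1): 30.0}
```

Comparing the busy time of each track against the total span of the trace gives a rough view of which timelines are mostly idle. How tracks map to cores and transfers depends on how the exporter assigns pid and tid.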
Typed IR (for tools and agents)
For structured, JSON-serializable output, convert the context's objects into the typed IR models. These are the same models the MCP server returns:
from stream.ir import WorkloadIR, AcceleratorIR, AllocationIR
workload_ir = WorkloadIR.from_internal(ctx.get("workload"))
accelerator_ir = AcceleratorIR.from_internal(ctx.get("accelerator"))
allocation_ir = AllocationIR.from_internal(ctx.get("scheduler"))
allocation_data = allocation_ir.model_dump() # JSON-compatible dict
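Once the IR is a plain dict, it can be written to disk with the standard library. The sketch below is one way to do that, assuming the dict is JSON-compatible as the comment above states. `save_ir` is a hypothetical helper, and the stand-in dict only takes the place of `allocation_ir.model_dump()`; the real field names are not shown on this page.

```python
import json
from pathlib import Path


def save_ir(data: dict, path: str) -> Path:
    """Write a JSON-compatible dict to `path`, creating parent folders."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys gives stable output, which keeps diffs between runs readable.
    out.write_text(json.dumps(data, indent=2, sort_keys=True))
    return out


# Stand-in for allocation_ir.model_dump(); the field name is made up.
allocation_data = {"example_field": [1, 2, 3]}
```

For example, `save_ir(allocation_data, "outputs/my_experiment/allocation.json")` (a placeholder path) stores the result next to the other run files, where another tool or agent can load it with `json.load`.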
AllocationIR exposes persona views - .algorithmic_view(), .hardware_view(), .compiler_view() - each shaping the same result for a different consumer. The performance view surfaces bottleneck (compute- vs transfer-bound) cycles and per-node utilization. See Using Stream with an AI agent for details.