CLI Reference
Boilermaker ships a boilermaker command for inspecting and managing TaskGraph state directly from Azure Blob Storage and Azure Service Bus. It is useful for diagnosing stalled graphs in production without touching application code.
Installation
The CLI is included when you install the package.
Global options
--storage-url and --container are required by all commands and go before the subcommand name. They can also be provided via environment variables:
| Option | Env var | Description |
|---|---|---|
| --storage-url URL | BOILERMAKER_STORAGE_URL | Azure Blob Storage account URL |
| --container NAME | BOILERMAKER_CONTAINER | Blob container name |
| -v / --verbose | (none) | Enable debug logging |
| --no-color | NO_COLOR | Disable colored output |
boilermaker \
--storage-url "$AZURE_STORAGE_URL" \
--container "$CONTAINER_NAME" \
<subcommand> [subcommand options]
Setting BOILERMAKER_STORAGE_URL and BOILERMAKER_CONTAINER in your environment lets you omit those flags entirely.
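As a minimal sketch of the environment-variable route, the snippet below exports both variables once and wraps inspect in a small helper. The account URL, the container name, and the helper name inspect_graph are placeholders chosen for illustration; they are not values defined by Boilermaker.

```shell
# Placeholder values (assumptions): substitute your own account URL and container.
export BOILERMAKER_STORAGE_URL="https://myaccount.blob.core.windows.net"
export BOILERMAKER_CONTAINER="task-results"

# Hypothetical helper: with both variables exported, the global flags are omitted.
inspect_graph() {
    boilermaker inspect --graph "$1"
}
```

Calling inspect_graph "$GRAPH_ID" should then behave like the fully spelled-out command shown earlier.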
Authentication
The CLI uses DefaultAzureCredential for blob storage access. Make sure your environment is authenticated (e.g. via az login, a managed identity, or service principal environment variables such as AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_CLIENT_SECRET).
Commands that publish to Service Bus (recover, invoke) also use DefaultAzureCredential for Service Bus access.
Commands
inspect
Print a rich status panel and task table for a TaskGraph. Pass --task to drill into a single task.
boilermaker --storage-url <url> --container <name> inspect \
--graph <graph_id> \
[--task <task_id>] \
[--json]
Options
| Option | Required | Description |
|---|---|---|
| --graph GRAPH_ID | Yes | The ID of the graph to inspect |
| --task TASK_ID | No | Inspect a single task (requires --graph) |
| --json | No | Output machine-readable JSON instead of formatted output |
Example — inspect a graph
boilermaker \
--storage-url "$AZURE_STORAGE_URL" \
--container "$CONTAINER_NAME" \
inspect --graph "$GRAPH_ID"
Output (color-coded in a real terminal):
╭─────────────────────────────────────────────╮
│ Graph: 019d8c0c-bd9b-7c23-be84-4d0799d7ecd4 │
│ Status: In Progress │
│ Complete: 2/3 │
│ Failures: 0 Stalled: 1 │
╰─────────────────────────────────────────────╯
Task ID (short) Function Status Type
────────────── ────────────────── ────────── ──────────
be84-4c416d85 fetch_data ✓ Success child
be84-4c5388b8 process_report ⚠ Retry child STALLED
be84-4c6a1b2c send_notification · Pending child
A task is considered stalled if its status is Scheduled, Started, or Retry — states that indicate the task was dispatched but no worker has written a terminal result.
Example — inspect a single task
boilermaker \
--storage-url "$AZURE_STORAGE_URL" \
--container "$CONTAINER_NAME" \
inspect --graph "$GRAPH_ID" --task "$TASK_ID"
Output:
╭─── Task Detail ────────────────────────────────────────────╮
│ Task ID: 019d8c0c-bd9b-7c23-be84-4c5388b8de6c │
│ Function: process_report │
│ Status: ⚠ Retry STALLED │
│ Type: child │
│ Graph: 019d8c0c-bd9b-7c23-be84-4d0799d7ecd4 │
│ │
│ Dependencies: fetch_data │
│ Dependents: send_notification │
╰────────────────────────────────────────────────────────────╯
Example — JSON output
boilermaker \
--storage-url "$AZURE_STORAGE_URL" \
--container "$CONTAINER_NAME" \
inspect --graph "$GRAPH_ID" --json
{
"graph_id": "019d8c0c-bd9b-7c23-be84-4d0799d7ecd4",
"is_complete": false,
"has_failures": false,
"stalled_count": 1,
"tasks": [
{
"task_id": "019d8c0c-bd9b-7c23-be84-4c416d85779d",
"function_name": "fetch_data",
"status": "Success",
"type": "child",
"is_stalled": false,
"depends_on": []
}
],
"fail_tasks": []
}
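Because --json emits the fields shown above, stalled tasks can be pulled out with a JSON tool. The sketch below assumes jq is installed and that the JSON is written to stdout. The helper name stalled_task_ids is hypothetical, and it relies only on fields that appear in the sample output (tasks, is_stalled, task_id).

```shell
# Hypothetical helper: reads `inspect --json` output on stdin and prints
# the task_id of every task whose is_stalled field is true, one per line.
stalled_task_ids() {
    jq -r '.tasks[] | select(.is_stalled) | .task_id'
}
```

A typical use would be to pipe boilermaker inspect --graph "$GRAPH_ID" --json into stalled_task_ids. Since inspect exits with 1 when stalled tasks exist, a script running under set -o pipefail will see the pipeline as failed even though the IDs were printed.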
Exit codes
| Code | Meaning |
|---|---|
| 0 | Graph is healthy: no stalled tasks |
| 1 | One or more stalled tasks found |
| 2 | Error (graph not found, missing arguments, etc.) |
inspect is safe to use in scripts and health checks:
boilermaker \
--storage-url "$AZURE_STORAGE_URL" \
--container "$CONTAINER_NAME" \
inspect --graph "$GRAPH_ID"
if [ $? -eq 1 ]; then
    echo "Graph has stalled tasks; consider running recover"
fi
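A health check usually needs to tell "stalled" apart from "the check itself failed". The following sketch maps the three documented exit codes to a one-word status. The function name graph_health is hypothetical, and the global options are assumed to come from BOILERMAKER_STORAGE_URL and BOILERMAKER_CONTAINER.

```shell
# Hypothetical helper: prints healthy, stalled, or error for a graph and
# returns the exit code that inspect produced.
graph_health() {
    local graph_id="$1" rc=0

    # Capture the exit code without aborting scripts that run under `set -e`.
    boilermaker inspect --graph "$graph_id" >/dev/null 2>&1 || rc=$?

    case "$rc" in
        0) echo "healthy" ;;
        1) echo "stalled" ;;
        *) echo "error" ;;   # 2 is the documented error code; others are unexpected
    esac
    return "$rc"
}
```

Treating every code other than 0 and 1 as an error keeps the check conservative if the process is killed or the CLI is missing from PATH.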
recover
Re-publish all stalled tasks for a graph to Service Bus. Each recovery message uses a unique message ID (<task_id>:recovery:<timestamp>) so it bypasses Service Bus duplicate detection.
boilermaker --storage-url <url> --container <name> recover \
--graph <graph_id> \
--sb-namespace-url <url> \
--sb-queue-name <name>
Options
| Option | Required | Description |
|---|---|---|
| --graph GRAPH_ID | Yes | The ID of the graph to recover |
| --sb-namespace-url URL | Yes | Service Bus namespace URL (e.g. https://myns.servicebus.windows.net) |
| --sb-queue-name NAME | Yes | Service Bus queue name |
Example
boilermaker \
--storage-url "$AZURE_STORAGE_URL" \
--container "$CONTAINER_NAME" \
recover \
--graph "$GRAPH_ID" \
--sb-namespace-url "$SERVICE_BUS_NAMESPACE_URL" \
--sb-queue-name "$QUEUE_NAME"
Output:
✓ Recovered: be84-4c5388b8 (process_report) — msg_id: 019d...:recovery:1713095548
✗ Failed: be84-4c6a1b2c (send_notification) — <error message>
Exit codes
| Code | Meaning |
|---|---|
| 0 | All stalled tasks recovered successfully |
| 1 | One or more tasks failed to recover |
| 2 | Error (graph not found, missing arguments, etc.) |
Recovery re-executes tasks
Recovery publishes the task again to the Service Bus queue. If the original task execution is still running (e.g. a very slow worker), both copies may run concurrently. Ensure your task handlers are idempotent before using recover.
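Given that recovery re-executes tasks, one cautious pattern is to call recover only when inspect actually reports stalled tasks. The sketch below is one way to do that. The function name recover_if_stalled is hypothetical, the Service Bus settings are read from the same SERVICE_BUS_NAMESPACE_URL and QUEUE_NAME variables used in the example above, and the global options are assumed to come from the BOILERMAKER_* environment variables.

```shell
# Hypothetical helper: run recover only when inspect exits with 1 (stalled tasks).
recover_if_stalled() {
    local graph_id="$1" rc=0

    boilermaker inspect --graph "$graph_id" >/dev/null || rc=$?

    if [ "$rc" -eq 0 ]; then
        echo "graph $graph_id is healthy; nothing to recover"
        return 0
    elif [ "$rc" -ne 1 ]; then
        # Exit code 2 (or anything unexpected): do not publish anything.
        echo "inspect failed with exit code $rc" >&2
        return "$rc"
    fi

    boilermaker recover \
        --graph "$graph_id" \
        --sb-namespace-url "$SERVICE_BUS_NAMESPACE_URL" \
        --sb-queue-name "$QUEUE_NAME"
}
```

The function returns the exit code of recover when it runs, so callers can still distinguish a partial recovery (1) from a hard error (2). This guard does not make task handlers idempotent; it only avoids publishing when nothing is stalled.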
purge
Delete old task-result blobs from Azure Blob Storage. Graphs with in-progress tasks are skipped automatically unless --force is given.
boilermaker --storage-url <url> --container <name> purge \
--older-than <days> \
[--dry-run] \
[--force] \
[--all-graphs]
Options
| Option | Required | Description |
|---|---|---|
| --older-than DAYS | Yes | Delete graphs older than DAYS days, based on created_date tags (1–30 inclusive) |
| --dry-run | No | Print what would be deleted without deleting any blobs |
| --force | No | Also delete graphs that have in-progress tasks |
| --all-graphs | No | Discover graphs by UUID7 timestamp instead of tag index; use for containers with pre-tag blobs |
Example — dry run first, then execute
# Step 1: preview what will be deleted
boilermaker \
--storage-url "$AZURE_STORAGE_URL" \
--container "$CONTAINER_NAME" \
purge --older-than 7 --dry-run
# Step 2: execute after confirming the plan
boilermaker \
--storage-url "$AZURE_STORAGE_URL" \
--container "$CONTAINER_NAME" \
purge --older-than 7
Dry-run output
Purge plan: graphs with created_date before 2026-04-07 (older than 7 days)
Graph ID Blobs
────────────────────────────────────────── ─────
019d8c0c-bd9b-7c23-be84-4d0799d7ecd4 12
019d8c0c-bd9b-7c23-be84-4d0799d7ecd5 3
[DRY RUN] No blobs were deleted.
Graphs with in-progress tasks are printed to stderr and excluded from the plan.
Post-deletion output
Exit codes
| Code | Meaning |
|---|---|
| 0 | Success: no errors, no skipped graphs (or dry-run completed) |
| 1 | One or more graphs skipped due to in-progress tasks |
| 2 | Unrecoverable error (auth failure, container not found, all deletions failed) |
Deletion is irreversible
Deleted blobs cannot be recovered unless Azure soft-delete is enabled on the storage account. Always run with --dry-run first to confirm the scope. Concurrent purge invocations against the same container are safe — any blob already deleted by a concurrent process returns a 404, which is treated as a no-op.
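The dry-run-first advice can be folded into a small interactive wrapper. This is a sketch under assumptions: the function name purge_with_confirmation is hypothetical, the global options are assumed to come from the BOILERMAKER_* environment variables, and the range check simply mirrors the 1 to 30 limit documented for --older-than.

```shell
# Hypothetical helper: preview with --dry-run, then delete only after an explicit "y".
purge_with_confirmation() {
    local days="$1" answer rc=0

    # Mirror the documented 1-30 (inclusive) range before calling the CLI.
    case "$days" in
        ''|*[!0-9]*) echo "days must be a whole number" >&2; return 2 ;;
    esac
    if [ "$days" -lt 1 ] || [ "$days" -gt 30 ]; then
        echo "days must be between 1 and 30" >&2
        return 2
    fi

    # Preview first; stop only on an error exit code (2 or higher).
    boilermaker purge --older-than "$days" --dry-run || rc=$?
    if [ "$rc" -ge 2 ]; then
        return "$rc"
    fi

    printf 'Delete the graphs listed above? [y/N] '
    read -r answer
    if [ "$answer" = "y" ]; then
        boilermaker purge --older-than "$days"
    else
        echo "Aborted; nothing was deleted."
    fi
}
```

Running purge_with_confirmation 7 prints the plan and waits for input. Anything other than a lone y leaves the container untouched.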
invoke
Publish a single task to Service Bus. Useful for manually triggering a specific task without re-running the whole graph.
boilermaker --storage-url <url> --container <name> invoke <task_id> \
--graph <graph_id> \
--sb-namespace-url <url> \
--sb-queue-name <name> \
[--force]
Options
| Option | Required | Description |
|---|---|---|
| TASK_ID | Yes | Positional: the task ID to invoke |
| --graph GRAPH_ID | Yes | The graph the task belongs to |
| --sb-namespace-url URL | Yes | Service Bus namespace URL |
| --sb-queue-name NAME | Yes | Service Bus queue name |
| --force | No | Allow re-invocation of tasks already in a terminal state |
Example
boilermaker \
--storage-url "$AZURE_STORAGE_URL" \
--container "$CONTAINER_NAME" \
invoke "$TASK_ID" \
--graph "$GRAPH_ID" \
--sb-namespace-url "$SERVICE_BUS_NAMESPACE_URL" \
--sb-queue-name "$QUEUE_NAME"
Output:
If the task is already in a terminal state (Success, Failure, RetriesExhausted, etc.) and --force is not given, invoke does not publish the task and exits with code 2.
Exit codes
| Code | Meaning |
|---|---|
| 0 | Task published successfully |
| 2 | Error (graph/task not found, terminal state without --force, publish failed) |
Invoke re-executes a task
invoke publishes the task to the queue without resetting its result blob. If the worker picks it up, it will overwrite the existing result. Use --force intentionally and ensure your task handler is idempotent.
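To keep --force a deliberate choice in scripts, a wrapper can require an explicit extra argument before adding the flag. The sketch below shows one way to do this. The function name invoke_task and its force argument are hypothetical, the Service Bus settings come from the SERVICE_BUS_NAMESPACE_URL and QUEUE_NAME variables used in the example above, and the global options are assumed to come from the BOILERMAKER_* environment variables.

```shell
# Hypothetical helper: invoke_task GRAPH_ID TASK_ID [force]
# --force is added only when the third argument is the literal word "force".
invoke_task() {
    local graph_id="$1" task_id="$2" force="${3:-}"

    # Rebuild the positional parameters as the argument list for the CLI.
    set -- invoke "$task_id" \
        --graph "$graph_id" \
        --sb-namespace-url "$SERVICE_BUS_NAMESPACE_URL" \
        --sb-queue-name "$QUEUE_NAME"

    if [ "$force" = "force" ]; then
        set -- "$@" --force
    fi

    boilermaker "$@"
}
```

With this shape, invoke_task "$GRAPH_ID" "$TASK_ID" fails with exit code 2 on a task in a terminal state, and re-running it requires typing the word force, which makes accidental re-execution less likely.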