docs(orchestration): add multi-agent orchestration pages #84

Open
hongyi-chen wants to merge 3 commits into hyc/orchestration-launch from hyc/orch/multi-agent

@hongyi-chen
Collaborator

Summary

Adds documentation for multi-agent orchestration (Codename Maestro): a new concept page describing the parent/child model and supported patterns, and a how-to page covering every way to start an orchestrated run (slash command, Oz CLI, Oz web app, REST API). Cross-references are wired into the deployment-patterns, managing-cloud-agents, oz-web-app, and agent-notifications pages so an existing reader can find the new content from the surfaces they already use.

Pages

New:

  • src/content/docs/agent-platform/cloud-agents/orchestration/index.mdx — concepts: parent/child model, local/cloud combinations, lifecycle event types, messaging API, common patterns (supervisor/worker, fan-out, critic, review swarm, DAG, swarm), approval (orchestrate) mode, observability.
  • src/content/docs/agent-platform/cloud-agents/orchestration/multi-agent-runs.mdx — how-to: /orchestrate slash command, oz agent run-cloud (agent-driven and script-driven), Oz web app entry points, POST /agent/runs with mode: orchestrate or parent_run_id, SSE + batch poll for lifecycle events, POST /agent/messages, fleet cancellation.

Edited:

  • src/content/docs/agent-platform/cloud-agents/deployment-patterns.mdx — replaced the inline 'fan-out parallel work (sharding)' recipe with a link to the new orchestration pages.
  • src/content/docs/agent-platform/cloud-agents/managing-cloud-agents.mdx — added an 'Orchestrated runs (parent and child)' section describing how the hierarchy is rendered in the management view.
  • src/content/docs/agent-platform/cloud-agents/oz-web-app.mdx — added an 'Orchestrated runs' section covering where to start an orchestration, the live detail pane, and the post-execute roll-up.
  • src/content/docs/agent-platform/capabilities/agent-notifications.mdx — added a 'Notifications in orchestrated runs' section describing child lifecycle events, child messages, and blocked-child propagation.

Research grounding

  • /orchestrate slash command behavior: app/src/search/slash_command_menu/static_commands/commands.rs (gated by FeatureFlag::Orchestration, in PREVIEW_FLAGS).
  • Lifecycle event names: warp-server/logic/agent_lifecycle.go (run_in_progress, run_succeeded, run_failed, run_errored, run_blocked, run_cancelled).
  • REST surfaces: warp-server/router/handlers/public_api/agent_messaging.go (messages, events, SSE stream). parent_run_id field on RunAgentRequest/RunItem from warp-server/public_api/openapi.yaml.
  • Parent/child plumbing: warp-internal/app/src/ai/blocklist/action_model/execute/start_agent.rs, warp-internal/app/src/ai/agent/conversation.rs::orchestration_agent_id().
  • AgentRunMode enum (orchestrate vs plan): warp-server/public_api/openapi.yaml.
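
As a small illustration of how a script might consume the lifecycle event names listed above, here is a minimal sketch that sorts them into "finished" and "still live". The six names come from this description; the grouping itself and the `is_terminal_event` helper are assumptions for illustration and are not confirmed by the server code.

```bash
# Sketch only. The six event names come from the list above; which of them
# count as terminal is an assumption made for illustration.

# is_terminal_event EVENT -> exit 0 if the run is assumed finished, 1 if it
# is assumed still live, 2 if the name is not one of the six listed.
is_terminal_event() {
  case "$1" in
    run_succeeded|run_failed|run_errored|run_cancelled) return 0 ;;
    run_in_progress|run_blocked)                        return 1 ;;
    *)                                                  return 2 ;;
  esac
}
```

A polling loop could call `is_terminal_event "$event"` and stop waiting on a child once it returns 0.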

Open questions / needs PM confirmation

  • GA vs preview state of /orchestrate. It currently ships in PREVIEW_FLAGS (Preview/Friends-of-Warp), not in RELEASE_FLAGS. The new run-how-to page notes this with a :::note callout; please confirm the GA cutover plan so I can drop the note if it's already public at launch.
  • SSE stream availability. agent/events/stream is registered only on the RTC server tier and behind OrchestrationV2InfrastructureEnabled. I described it as supported with a 'real-time fan-out tier' caveat. Please confirm whether external API users get this endpoint at launch or whether we should call it 'preview only' for now.
  • oz run list --parent-run-id flag. The cancellation example uses this flag; the Oz CLI surface for filtering by parent run is implicit from the API but I did not verify a public CLI flag exists. Please confirm the exact CLI surface before launch.
  • Image assets. No screenshots are included for the management view's nesting chip or the web app's children/post-execute pane. Once design ships final visuals, we should add captures to src/assets/agent-platform/orchestration/.

Out of scope (per gating plan)

  • No edits to src/sidebar.ts (reserved for the cross-cutting sidebar PR).
  • No edits to vercel.json (reserved for the cross-cutting redirects PR).
  • No edits to top-level landing pages.
  • No edits to multi-harness pages (owned by another agent).

Artifacts

Co-Authored-By: Oz <oz-agent@warp.dev>

Add concept and run-how-to pages under cloud-agents/orchestration/.
Wire orchestration cross-references into deployment-patterns,
managing-cloud-agents, oz-web-app, and agent-notifications.

Co-Authored-By: Oz <oz-agent@warp.dev>
@cla-bot cla-bot Bot added the cla-signed label May 16, 2026
@vercel

vercel Bot commented May 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Error | Error | May 16, 2026 9:31pm |

@oz-for-oss
Contributor

oz-for-oss Bot commented May 16, 2026

@hongyi-chen

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Contributor

@oz-for-oss oz-for-oss Bot left a comment

Overview

This PR adds new orchestration concept and how-to documentation plus cross-links from existing agent platform pages.

Concerns

  • The CLI fan-out example captures a parent run ID but never uses it when launching children, so the example does not actually create an orchestration hierarchy.
  • The fleet cancellation example publishes a parent-run CLI filter that the PR description identifies as unverified; this should be confirmed or replaced with the supported API/CLI surface before launch.

Verdict

Found: 0 critical, 2 important, 0 suggestions

Request changes


PARENT_RUN_ID=$(oz agent run-cloud --prompt "Coordinate the migration" --environment my-env --output-format json | jq -r .run_id)

for shard in pkg-a pkg-b pkg-c; do
oz agent run-cloud \

⚠️ [IMPORTANT] This example captures PARENT_RUN_ID but never passes it to the child runs, so the launched runs are independent and will not appear nested under the parent. Add the supported parent-run flag here, or use the API parent_run_id field if the CLI cannot link children yet.
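
One way to read the API alternative suggested here, as a minimal sketch: build one `POST /agent/runs` request body per shard, each carrying the parent's run ID in `parent_run_id`. Only the endpoint path and the `parent_run_id` field come from this PR; the `prompt` field name and the helper names are assumptions for illustration. The functions only print request bodies and send nothing.

```bash
# Sketch only: prints one JSON body per shard for POST /agent/runs.
# Assumptions (not confirmed by this PR): the `prompt` field name and
# the helper names below. `parent_run_id` is the field named in the PR.

# Escape backslashes and double quotes so simple strings stay valid JSON.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

# child_run_payload PARENT_RUN_ID PROMPT -> JSON body on stdout
child_run_payload() {
  printf '{"prompt":"%s","parent_run_id":"%s"}\n' \
    "$(json_escape "$2")" "$(json_escape "$1")"
}

# fan_out_payloads PARENT_RUN_ID SHARD... -> one body per shard
fan_out_payloads() {
  local parent=$1 shard
  shift
  for shard in "$@"; do
    child_run_payload "$parent" "Migrate $shard"
  done
}
```

For example, `fan_out_payloads "$PARENT_RUN_ID" pkg-a pkg-b pkg-c` prints three bodies. Each could then be piped to `curl -X POST --data @-` once the base URL and auth scheme are confirmed.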

To clean up a whole orchestration, list children first and cancel each one:

```bash
oz run list --parent-run-id RUN_PARENT_ID --format json \
```

⚠️ [IMPORTANT] The PR description says this --parent-run-id filter was not verified; shipping an unconfirmed CLI command will give users a broken fleet-cancellation path if the flag does not exist. Replace it with the confirmed CLI syntax or the supported API call before publishing.
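
If the API route is chosen, a fleet cancellation could be reviewed as a dry run first. The sketch below prints one `curl` command per child run ID instead of executing it. The `POST /agent/runs/{runId}/cancel` path is the one cited later in this PR; the base URL, the `OZ_API_KEY` variable name, and Bearer auth are assumptions for illustration.

```bash
# Sketch only: prints the curl commands a fleet cancellation would run.
# Assumptions: the base URL, the OZ_API_KEY variable name, and Bearer auth
# are hypothetical. The cancel path is the one cited in this PR.

# cancel_url BASE RUN_ID -> cancel endpoint URL for one run
cancel_url() {
  printf '%s/agent/runs/%s/cancel\n' "${1%/}" "$2"
}

# print_cancel_commands BASE  (child run IDs on stdin, one per line)
print_cancel_commands() {
  local base=$1 run_id
  while IFS= read -r run_id || [ -n "$run_id" ]; do
    [ -n "$run_id" ] || continue   # skip blank lines
    printf 'curl -fsS -X POST -H "Authorization: Bearer $OZ_API_KEY" %s\n' \
      "$(cancel_url "$base" "$run_id")"
  done
}
```

Printing the commands rather than running them lets a reviewer check the list of run IDs before anything is cancelled.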

- Replace broken CLI fan-out example (no `--parent-run-id` flag on
  `oz agent run-cloud`) with a curl-based scripted fan-out that uses the
  public `POST /agent/runs` endpoint with `parent_run_id`. Add a note
  pointing script-driven callers at the API.
- Replace the fleet-cancellation example: use `oz run list --ancestor-run`
  (supported) instead of the nonexistent `--parent-run-id` flag, and
  curl the public `POST /agent/runs/{runId}/cancel` endpoint instead of
  the nonexistent `oz run cancel` CLI command.
- Remove the SSE and `/agent/messages` / `/agent/events` API sections
  (still internal-only / dogfood-gated) and reframe lifecycle events
  and messaging conceptually.
- Add a new "Retrieving conversations and artifacts" section covering
  `GET /agent/runs?ancestor_run_id=`, `GET /agent/runs/{runId}`,
  `GET /agent/runs/{runId}/conversation`, and the `artifacts` field on
  RunItem for both parent and child runs.
- Fix `parent_run_id` query param to `ancestor_run_id` everywhere.
- Replace "service account" with "agent identity" in deployment-patterns.
- Senior-writer pass: tighten frontmatter descriptions, standardize
  em-dash usage in bold-term lists, ALL_CAPS placeholders, drop
  ambiguous modal verbs.

Addresses PR #84 review comments r3253425053 and r3253425056.

Co-Authored-By: Oz <oz-agent@warp.dev>
@hongyi-chen
Collaborator Author

Revision pass

I addressed every actionable review comment and applied Slack-thread feedback. Summary of changes:

Review comments

  • r3253425053 — Fixed. oz agent run-cloud has no --parent-run-id flag, so the script-driven CLI fan-out couldn't link children to the parent. Replaced the example with two flows: (1) the recommended agent-driven oz agent run-cloud flow, and (2) a curl-based scripted fan-out using the public POST /agent/runs endpoint with parent_run_id. Added a :::note explaining script-driven parent linking requires the API.
  • r3253425056 — Fixed. Replaced the nonexistent oz run list --parent-run-id flag with the supported oz run list --ancestor-run, and the nonexistent oz run cancel CLI command with curl calls to the public POST /agent/runs/{runId}/cancel. Added a :::caution callout that self-hosted, local, and GitHub Action runs return 422.
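
A script that cancels runs through `POST /agent/runs/{runId}/cancel` has to tell the 422 case apart from other failures. A minimal sketch, assuming the caller already has the HTTP status code in hand: the 422 meaning is taken from the caution described above, while treating any 2xx as success and everything else as a generic failure is an assumption. The `classify_cancel_status` name and its labels are hypothetical.

```bash
# Sketch only: maps an HTTP status from the cancel endpoint to a short label.
# 422 = run type not cancellable through this endpoint (self-hosted, local,
# or GitHub Action runs, per this PR). Other mappings are assumptions.
classify_cancel_status() {
  case "$1" in
    2[0-9][0-9]) echo "cancelled" ;;
    422)         echo "not-cancellable-here" ;;
    *)           echo "failed" ;;
  esac
}
```

With `curl`, the status code can be captured using `-o /dev/null -w '%{http_code}'` and passed to this function.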

Slack-thread + internal-only scrub

  • Removed the SSE stream and /agent/messages / /agent/events API sections (all marked x-internal: true in the OpenAPI spec, still gated behind dogfood flags) and reframed lifecycle events and messaging conceptually.
  • Removed internal type/field names (MessagesReceivedFromAgents, lifecycle_subscription) and infrastructure references (RTC / "real-time fan-out tier").
  • Fixed the API filter param: ?parent_run_id= → ?ancestor_run_id= everywhere. The field on a RunItem is parent_run_id; the supported list filter is ancestor_run_id.
  • Replaced "service account" with "agent identity" in deployment-patterns.mdx.
  • "Oz harness" noun phrase wasn't present; harness references already read as "harness" or "[harness] children."

New: conversations and artifacts in the API

Added a "Retrieving conversations and artifacts" section to multi-agent-runs.mdx covering GET /agent/runs?ancestor_run_id=, GET /agent/runs/{runId} (returns parent_run_id, conversation_id, artifacts), GET /agent/runs/{runId}/conversation, and GET /agent/runs/{runId}/transcript — all public endpoints — for both parent and child runs.
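
To make the four read paths concrete, here is a minimal sketch that only builds the URLs named above and sends nothing. The paths and the `ancestor_run_id` filter come from this PR; the base URL passed in and the helper names are hypothetical.

```bash
# Sketch only: builds the read-only URLs listed above; it sends nothing.
# Assumption: the base URL passed in is hypothetical; paths are from this PR.

# children_url BASE ANCESTOR_RUN_ID -> list-runs URL filtered by ancestor
children_url() {
  printf '%s/agent/runs?ancestor_run_id=%s\n' "${1%/}" "$2"
}

# run_url BASE RUN_ID [conversation|transcript] -> run detail or sub-resource
run_url() {
  local base=${1%/} run_id=$2 sub=${3:-}
  case "$sub" in
    '')                      printf '%s/agent/runs/%s\n' "$base" "$run_id" ;;
    conversation|transcript) printf '%s/agent/runs/%s/%s\n' "$base" "$run_id" "$sub" ;;
    *) echo "unknown sub-resource: $sub" >&2; return 1 ;;
  esac
}
```

The same `run_url` call works for a parent or a child run ID, which mirrors the point that both kinds of run expose the same endpoints.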

Senior-writer pass

Tightened frontmatter descriptions to single-sentence standalone summaries, standardized em-dash usage in bold-term lists, replaced ambiguous modal verbs ("may," "could"), switched placeholders to ALL_CAPS, and kept the :::note Preview callout for /orchestrate.

Validation

  • python3 .agents/skills/style_lint/style_lint.py --changed → 0 issues
  • python3 .agents/skills/check_for_broken_links/check_links.py --internal-only → 0 broken links

Open questions

None blocking. Optional follow-up: when the public /agent/messages and /agent/events endpoints leave dogfood, add a dedicated "Messaging API" section back to multi-agent-runs.mdx.


The supported event types are:

* **`run_in_progress`** — the run started executing (or restarted after being paused or blocked).
Collaborator Author

@hongyi-chen hongyi-chen May 16, 2026

do we need to expose these to the users?

…essaging

- Drop "Each child run records its parent's run ID so the management
  view, API, and web app can render the hierarchy" implementation detail
  in favor of a user-facing sentence.
- Shrink "Messaging between agents" in orchestration/index.mdx to a
  two-sentence high-level overview (was 1 paragraph + 3 bullets + 1
  paragraph).
- Tighten the orchestration index opener, the Observability API bullet,
  and the approval-mode "before paying for it" phrasing.
- Drop the "tools available with model/harness" qualifier from the CLI
  section in multi-agent-runs.mdx.
- Collapse the three "Retrieving conversations and artifacts" subsections
  into a single section that states the response shape once.
- Trim oz-web-app.mdx (remove redundant Messages bullet, tighten
  post-execute summary) and managing-cloud-agents.mdx ("Each child is a
  full Oz run" framing).
- Simplify the Messages-from-children notification bullet.

Co-Authored-By: Oz <oz-agent@warp.dev>
@hongyi-chen
Collaborator Author

Copy pass

Addressed the feedback in commit 97454cf:

  • Dropped implementation-detail framing. The "Each child run records its parent's run ID so the management view, API, and web app can render the hierarchy" sentence is gone, along with similar internals-leaking phrasings in multi-agent-runs.mdx (the parent_run_id linking explainer and the "orchestration tools available with model/harness" qualifier).
  • Shrunk "Messaging between agents" in orchestration/index.mdx from a paragraph + 3 bullets + a paragraph down to two high-level sentences. The detailed sender/recipient mechanics are gone; messaging now reads as a concept, not a spec.
  • General copy pass. Tightened the page opener, fixed "before paying for it," trimmed the Observability API bullet, collapsed the three "Retrieving conversations and artifacts" subsections into one section that states the response shape once, removed the now-redundant Messages bullet in oz-web-app.mdx, and dropped the "Each child is a full Oz run" framing from managing-cloud-agents.mdx.

Validation: style_lint --changed and check_links --internal-only both clean.
