Skip to content

Fix mission workflow phase orchestration#313

Merged
arul28 merged 4 commits into
mainfrom
ade/missions-cua-3077a78a
May 16, 2026
Merged

Fix mission workflow phase orchestration#313
arul28 merged 4 commits into
mainfrom
ade/missions-cua-3077a78a

Conversation

@arul28
Copy link
Copy Markdown
Owner

@arul28 arul28 commented May 16, 2026

Summary

  • Normalize every mission workflow to include a Planning phase before execution, including custom/preflight-selected phase decks.
  • Honor customizable workflow paths: first post-planning routing now follows the configured phase order, mustPrecede is enforced at runtime, custom validation/review phases can run validation workers, and execution previews keep custom phases.
  • Tighten mission reliability: read-only Codex workers stay read-only, result reports can be recovered from step output, finalization waits for mission closeout, and duplicate completion syncs no longer race.
  • Improve ADE CLI mission/worker support for status/result/validation reporting and socket daemon version handling.

Audit notes

The audit compared ADE missions against the expected long-running mission shape: collaborative planning, worker execution, validation/review gates, adaptable prompts, and configurable paths. The concrete gaps fixed here were phase normalization, custom phase routing, ordering constraints, custom validation gates, read-only enforcement, and completion/finalization consistency.

Validation

  • npm --prefix apps/desktop run typecheck -- --pretty false
  • npm --prefix apps/desktop run lint
  • npm --prefix apps/desktop run test -- --shard=1/8
  • npm --prefix apps/desktop run test -- --shard=2/8
  • npm --prefix apps/desktop run test -- --shard=3/8
  • npm --prefix apps/desktop run test -- --shard=4/8
  • npm --prefix apps/desktop run test -- --shard=5/8
  • npm --prefix apps/desktop run test -- --shard=6/8
  • npm --prefix apps/desktop run test -- --shard=7/8
  • npm --prefix apps/desktop run test -- --shard=8/8
  • npm --prefix apps/desktop run build
  • npm --prefix apps/ade-cli run typecheck
  • npm --prefix apps/ade-cli run test
  • npm --prefix apps/ade-cli run build
  • npm --prefix apps/web run typecheck
  • npm --prefix apps/web run build
  • node scripts/validate-docs.mjs
  • git diff --check

Summary by CodeRabbit

  • New Features

    • Added CLI commands for reporting worker status, results, and validation checks.
    • Implemented planning phase requirement in mission workflows.
    • Enhanced phase transition and ordering enforcement during orchestration.
  • Bug Fixes

    • Improved parsing of legacy test result summaries with better numeric count extraction.
    • Added step output file recovery when transcript parsing fails.
    • Fixed version mismatch detection to allow placeholder versions.
    • Enhanced mission state synchronization with deduplication of concurrent calls.
  • Tests

    • Added comprehensive test coverage for phase management, worker coordination, and result reporting.

Review Change Stack

Greptile Summary

This PR hardens ADE mission workflow orchestration across three areas: phase normalization (planning phase is now always injected before any execution phase), custom-phase routing (first post-planning transition follows configured order, mustPrecede constraints are enforced bidirectionally, validation workers are permitted in any dedicated-gate phase), and reliability fixes (read-only Codex workers stay in a read-only sandbox, result reports can be recovered from step-output files with a legacy-path fallback, syncMissionFromRun replaces its stale syncLocks guard with a proper in-flight Promise, and finalizeRun waits for mission closeout before returning).

  • Phase normalization: ensurePlanningPhase is applied at every mission-launch and phase-override site; resolveDevelopmentPhaseKey is replaced by resolveFirstPostPlanningPhaseKey everywhere a coordinator needs the first post-planning destination.
  • Custom-phase routing: findPhaseByRef / resolveMustPrecedePredecessors are lifted into phaseEngine.ts and used from both coordinatorTools.ts and coordinatorAgent.ts; buildExecutionPlanPreviewFromPhases now iterates phase cards in position order rather than a hardcoded six-element list.
  • Reliability & CLI: syncMissionFromRun deduplicates concurrent callers via a syncInFlight Promise map; step-output recovery probes both the new step-output-{key}-attempt-{id}.md and the legacy step-output-{key}.md path; new report_status / report_result / report_validation CLI commands are wired up; version-mismatch detection skips the placeholder 0.0.0 build.

Confidence Score: 4/5

The core logic changes are well-tested and directionally correct, but the coordinator's read_step_output tool still does not probe the legacy step-output filename when a latest attempt is known, leaving active missions mid-deployment unable to retrieve their output files through that tool.

The phase-normalization, mustPrecede enforcement, and syncMissionFromRun deduplication changes are solid and backed by new tests. The read-only Codex sandbox fix closes a real security gap. The remaining concern is that read_step_output in coordinatorTools.ts only tries the new attempt-scoped filename when a latestAttempt is available and never falls back to the legacy format, whereas the recovery path in orchestratorService.ts correctly probes both. Any coordinator agent running against a worker whose output was written in the old format before this deployment would get a silent read failure from that tool, even though the file exists on disk.

apps/desktop/src/main/services/orchestrator/coordinatorTools.ts — the read_step_output handler's candidateRelativePaths should include the legacy path as a second candidate when latestAttempt is non-null, mirroring the fallback already present in extractReportResultPayloadFromWorkerStepOutput.

Important Files Changed

Filename Overview
apps/desktop/src/main/services/missions/phaseEngine.ts Adds ensurePlanningPhase, resolveFirstPostPlanningPhaseKey, findPhaseByRef, resolveMustPrecedePredecessors, and normalizePhaseRef — all with correct edge-case handling (planning-absent, -1 index, mustBeFirst clearing).
apps/desktop/src/main/services/orchestrator/orchestratorService.ts Adds two-phase result recovery (transcript then step-output file with legacy-path fallback via extractReportResultPayloadFromWorkerStepOutput) at both open-session and finalize-session call sites; new-format step-output path includes attempt ID. Recovery logic is correct, but the coordinator's read_step_output tool still omits the legacy-path fallback (noted in a prior review).
apps/desktop/src/main/services/orchestrator/aiOrchestratorService.ts Replaces syncLocks-based deduplication with a syncInFlight Promise map so concurrent callers serialise on the same run; adds retrospectiveStateSync before the post-finalize mission sync; adds runtime review_summary evidence collection; collects snake_case risk_notes alongside camelCase. syncLocks is still populated inside the IIFE but is never read for gating — it is now dead state.
apps/desktop/src/main/services/orchestrator/coordinatorTools.ts Replaces hard-coded "validation" phase guard with phaseAllowsValidationWorker so review/dedicated phases can spawn validation workers; roleName is now matched against the validation pattern unconditionally (previously required includeNameHeuristics); read_step_output probes the new attempt-scoped filename but still lacks a legacy-format fallback when latestAttempt is known (pre-existing gap).
apps/desktop/src/main/services/orchestrator/executionPolicy.ts Adds canUseValidationStepAsPhaseCompletionSurrogate to credit a passing validation step as evidence for required phases like codeReview/testReview; rewrites buildExecutionPlanPreviewFromPhases to iterate phase cards in position order rather than a fixed six-phase list, preserving custom workflow phases in previews.
apps/desktop/src/main/services/orchestrator/providerOrchestratorAdapter.ts Removes the allowReadOnlyControlPlaneFullAuto loophole that let read-only Codex workers use danger-full-access sandboxing; read-only workers now always get sandboxMode: "read-only". Adds model_reasoning_effort config when a reasoning level is resolved.
apps/desktop/src/main/services/adeActions/registry.ts Adds waitForMissionCloseoutAfterFinalize to both buildOrchestratorCoreDomainService and buildAiOrchestratorDomainService wrappers so callers block until the mission reaches a terminal status after finalization. These are independent wrappers over the same underlying service method, so there is no double-poll risk for a single call.
apps/ade-cli/src/adeRpcServer.ts Rewrites summarizeLegacyTestsRun to parse numeric counts ("12 passed, 1 failed") via parseLegacyTestCount before falling back to per-entry status; result type widens to Record<string, unknown> and includes raw text and commands.
apps/ade-cli/src/cli.ts Adds report_status, report_result, report_validation to worker mission tool names with appropriate argument parsing; introduces shouldReplaceMachineRuntimeVersion so the 0.0.0 placeholder version never triggers a daemon restart loop.
apps/ade-cli/src/bootstrap.ts Switches createMissionPreflightService from a type-only import to a value import and actually instantiates it in the ADE CLI runtime; adds ELECTRON_RUN_AS_NODE=1 to the generated shim scripts so the Electron binary runs as a plain Node.js process.

Sequence Diagram

sequenceDiagram
    participant C1 as Caller A
    participant C2 as Caller B
    participant S as syncMissionFromRun
    participant IF as syncInFlight Map
    participant DB as OrchestratorService / DB

    C1->>S: syncMissionFromRun(runId, "reason_a")
    S->>IF: get(runId) → undefined
    S->>IF: set(runId, executeSync Promise)
    S-->>S: launch IIFE (void async)
    S->>DB: getRunGraph / updateMission
    C2->>S: "syncMissionFromRun(runId, "reason_b", {nextMissionStatus})"
    S->>IF: get(runId) → executeSync (A's promise)
    C2-->>C2: await activeSync (blocked)
    DB-->>S: sync complete
    S->>IF: delete(runId)
    S-->>S: resolveSync() → executeSync resolves
    C1-->>C1: await executeSync returns
    C2-->>C2: activeSync resolved, wakes up
    alt C2 has nextMissionStatus
        C2->>S: "syncMissionFromRun(runId, "reason_b", {nextMissionStatus})"
        S->>IF: get(runId) → undefined (clear)
        S->>IF: set(runId, executeSync2 Promise)
        S->>DB: getRunGraph / updateMission (applies nextMissionStatus)
        DB-->>S: sync complete
        S->>IF: delete(runId)
        C2-->>C2: returns
    else no nextMissionStatus
        C2-->>C2: returns immediately
    end
Loading

Comments Outside Diff (1)

  1. apps/desktop/src/main/services/orchestrator/coordinatorTools.ts, line 6817-6855 (link)

    P1 No legacy-format fallback in read_step_output

    When latestAttempt is non-null (essentially always for any live step), candidateRelativePaths only contains the new step-output-{key}-attempt-{id}.md path. Workers that were already running before this PR deployed were instructed to write to the old path .ade/step-output-{key}.md. After deployment, the coordinator cannot read those files — it returns ok: false, error: "…" as if no output exists.

    The legacy fallback (.ade/step-output-${sanitized}.md) is only tried when latestAttempt is null, which won't occur for any in-progress mission. Adding it as a second candidate when latestAttempt is known would preserve backward compatibility for active missions mid-deployment.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: apps/desktop/src/main/services/orchestrator/coordinatorTools.ts
    Line: 6817-6855
    
    Comment:
    **No legacy-format fallback in `read_step_output`**
    
    When `latestAttempt` is non-null (essentially always for any live step), `candidateRelativePaths` only contains the new `step-output-{key}-attempt-{id}.md` path. Workers that were already running before this PR deployed were instructed to write to the old path `.ade/step-output-{key}.md`. After deployment, the coordinator cannot read those files — it returns `ok: false, error: "…"` as if no output exists.
    
    The legacy fallback (`.ade/step-output-${sanitized}.md`) is only tried when `latestAttempt` is `null`, which won't occur for any in-progress mission. Adding it as a second candidate when `latestAttempt` is known would preserve backward compatibility for active missions mid-deployment.
    
    How can I resolve this? If you propose a fix, please make it concise.

    Fix in Claude Code

Fix All in Claude Code

Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
apps/desktop/src/main/services/orchestrator/aiOrchestratorService.ts:7010-7012
`syncLocks` is now written but never read for its original purpose. The concurrency guard was moved to the `syncInFlight` Promise map, so `syncLocks.add` / `syncLocks.delete` inside the IIFE are dead state. Leaving them in place is misleading — a future reader will assume `syncLocks` still prevents re-entrant syncs when it no longer does.

```suggestion
    void (async () => {
    try {
```

### Issue 2 of 3
apps/desktop/src/main/services/orchestrator/aiOrchestratorService.ts:7302-7306
The matching `syncLocks.delete` in the `finally` block is also dead after removing the `add` above. Keeping only `syncInFlight.delete` + `resolveSync()` makes the cleanup intent clear.

```suggestion
    } finally {
      syncInFlight.delete(runId);
      resolveSync();
    }
```

### Issue 3 of 3
apps/desktop/src/main/services/orchestrator/executionPolicy.ts:953-966
**`canUseValidationStepAsPhaseCompletionSurrogate` — custom-phase match is a substring search over all report fields**

For `codeReview` / `testReview` targets the function correctly requires both `"review summary"` and `"risk notes"` in the concatenated search text. For any other custom phase (e.g. a phase named `"release"`) only a single substring match against the normalized phase key / label is required — no stricter check is applied. A validation step whose summary says _"release gate validated"_ would therefore be credited as a surrogate for a `"release"` phase even if it was actually scoped to a different concern. If false positives in custom-phase surrogate matching are a concern, the `codeReview`/`testReview` multi-field requirement pattern could be extended to other custom phase keys as well.

Reviews (3): Last reviewed commit: "Preserve legacy worker step output recov..." | Re-trigger Greptile

@vercel
Copy link
Copy Markdown

vercel Bot commented May 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
ade Ignored Ignored Preview May 16, 2026 6:43am

@arul28
Copy link
Copy Markdown
Owner Author

arul28 commented May 16, 2026

@copilot review but do not make fixes

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 16, 2026

📝 Walkthrough

Walkthrough

This PR extends CLI worker mission tools (report_status, report_result, report_validation) with structured argument handling; introduces planning-phase enforcement throughout phase engine and mission creation; refactors coordinator phase dependency resolution to support flexible reference matching and mustPrecede constraints; enhances orchestrator closeout with review-summary and risk-notes evidence collection; and adds durable step-output recovery for worker results with placeholder filtering.

Changes

Worker Mission Tools, Phase Orchestration, and Orchestrator Closeout

Layer / File(s) Summary
CLI Worker Mission Tools and Test Result Parsing
apps/ade-cli/src/cli.ts, apps/ade-cli/src/adeRpcServer.ts, apps/ade-cli/src/bootstrap.ts, apps/ade-cli/src/cli.test.ts, apps/ade-cli/src/adeRpcServer.test.ts, apps/ade-cli/src/stdioRpcDaemon.test.ts
CLI now recognizes report_status, report_result, and report_validation worker mission tools, building RPC payloads from text/message/summary arguments; legacy test summarization parses numeric counts via regex; placeholder version gates runtime daemon restarts; missionPreflightService is instantiated and wired into AdeRuntime; integration tests verify legacy test normalization and placeholder-version behavior.
Phase Engine Planning Phase Enforcement
apps/desktop/src/main/services/missions/phaseEngine.ts, apps/desktop/src/main/services/missions/phaseEngine.test.ts, apps/desktop/src/main/services/missions/missionService.ts, apps/desktop/src/main/services/missions/missionPreflightService.ts
phaseEngine exports resolveFirstPostPlanningPhaseKey and ensurePlanningPhase to guarantee planning phase presence with proper ordering constraints; mission creation and missionPreflightService wrap phase selection with ensurePlanningPhase to enforce planning-phase requirement across all initialization paths; tests verify injection and phase resolution.
Coordinator Phase Dependency and Ordering Enforcement
apps/desktop/src/main/services/orchestrator/coordinatorAgent.ts, apps/desktop/src/main/services/orchestrator/coordinatorTools.ts, apps/desktop/src/main/services/orchestrator/coordinatorEventFormatter.ts, apps/desktop/src/main/services/orchestrator/coordinatorTools.test.ts
Coordinator introduces phase-reference matching and mustPrecede predecessor resolution helpers; phase ordering validation resolves mustFollow by flexible ref matching and enforces mustPrecede before phase start; spawn_worker blocks when target phase differs from current, returning phase-ordering block response; delegation tools use resolveFirstPostPlanningPhaseKey; validation-worker spawning checks if phase allows validation; tests extend assertions to structured blockedByPhaseOrdering payloads and planning-step metadata.
Orchestrator Mission Closeout and Evidence Collection
apps/desktop/src/main/services/orchestrator/aiOrchestratorService.ts, apps/desktop/src/main/services/adeActions/registry.ts, apps/desktop/src/main/services/orchestrator/aiOrchestratorService.test.ts
aiOrchestratorService derives next phase via resolveFirstPostPlanningPhaseKey; collects review_summary from step result/validation reports and runtime events (supporting both camelCase and snake_case); reads risk_notes (snake_case) conditionally based on mission success; deduplicates concurrent syncMissionFromRun calls via syncInFlight promise map; finalizeRun triggers post-success mission sync; adeActions registry introduces waitForMissionCloseoutAfterFinalize to poll mission status after run finalization.
Worker Result Recovery from Step Output Files
apps/desktop/src/main/services/orchestrator/orchestratorService.ts, apps/desktop/src/main/services/orchestrator/orchestratorService.test.ts
getWorkerStepOutputPath derives per-step step-output markdown file paths; looksLikePlanTemplatePlaceholder detects plan markdown placeholders; normalizeRecoveredWorkerResultReport filters plan/summary against placeholder patterns; onTrackedSessionEnded and completeAttempt attempt recovery from transcript first, then from durable step-output files; recovery flags distinguish source; tests verify step-output recovery and template-placeholder filtering.
Execution Policy Validation Surrogates and Codex Reasoning Config
apps/desktop/src/main/services/orchestrator/executionPolicy.ts, apps/desktop/src/main/services/orchestrator/executionPolicy.test.ts, apps/desktop/src/main/services/orchestrator/providerOrchestratorAdapter.ts, apps/desktop/src/main/services/orchestrator/orchestratorAdapters.test.ts
evaluateRunCompletionFromPhases supplements required phases with validation-surrogate steps when concrete success is missing; helpers normalize metadata and identify surrogates via phase references; buildExecutionPlanPreviewFromPhases preserves custom phase ordering via sorted position; providerOrchestratorAdapter derives reasoningEffort from step phaseModel and injects model-reasoning-effort config arguments; tests verify surrogate recognition, phase ordering, and read-only sandbox enforcement for Codex workers.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

desktop

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.20% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Fix mission workflow phase orchestration' directly addresses the main change—normalizing mission workflows and enforcing phase orchestration—which is the primary focus of this comprehensive PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ade/missions-cua-3077a78a

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@capy-ai
Copy link
Copy Markdown

capy-ai Bot commented May 16, 2026

Capy auto-review is paused for this organization because the monthly auto-review limit has been reached. Increase the limit or turn it off in billing settings to resume automatic reviews.

Copy link
Copy Markdown
Contributor

Copilot AI commented May 16, 2026

@copilot review but do not make fixes

Reviewed commit f4e06a6 only (no fixes made, as requested). I did not find any new blocking issues in this change set.

Copilot finished work on behalf of arul28 May 16, 2026 05:59
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/desktop/src/main/services/orchestrator/coordinatorTools.ts (1)

3292-3300: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validator specialists can still be spawned outside validation-capable phases.

This path passes validationHeuristics: "prompt_only" and drops the explicit role, so a specialist role like validator/reviewer is never considered by looksLikeValidationWorkerRequest(). If the objective text avoids the prompt keywords, the coordinator can still launch a validator specialist during development. The same gap exists in the subagent delegation call sites below.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src/main/services/orchestrator/coordinatorTools.ts` around lines
3292 - 3300, The validator-specialist spawn check is bypassed because
validateDelegationPromptForCurrentPhase is called with roleName: null and
validationHeuristics: "prompt_only", so looksLikeValidationWorkerRequest never
considers validator/reviewer roles; update the call sites (including the one
using validateDelegationPromptForCurrentPhase and the subagent delegation calls
below) to pass an explicit roleName of "validator" or "reviewer" (whichever
matches your role enum) when evaluating a validation specialist, or change
validationHeuristics to include role-based checks (e.g., "prompt_and_role") so
looksLikeValidationWorkerRequest can detect validator requests; ensure the same
fix is applied consistently to all matching calls in coordinatorTools.ts.
🧹 Nitpick comments (2)
apps/ade-cli/src/adeRpcServer.test.ts (2)

3497-3566: ⚡ Quick win

Consider strengthening the raw field assertion to verify all input strings are preserved.

The current assertion on line 3563 uses expect.stringContaining to verify that raw contains only one of the three input strings. If the normalization logic is intended to preserve all original legacy strings (not just the one with explicit counts), the test should verify that all three entries appear in the raw field to ensure complete legacy data retention and serve as clearer documentation of expected behavior.

♻️ Suggested enhancement to verify complete raw field preservation
       expect(response?.isError).toBeUndefined();
       expect(response.structuredContent.report.testsRun).toMatchObject({
         passed: 3,
         failed: 0,
         skipped: 0,
-        raw: expect.stringContaining("npm test (passed: 2 files, 3 tests)")
       });
+      expect(response.structuredContent.report.testsRun.raw).toContain("npm run typecheck (passed)");
+      expect(response.structuredContent.report.testsRun.raw).toContain("npm test (passed: 2 files, 3 tests)");
+      expect(response.structuredContent.report.testsRun.raw).toContain("ADE_PROJECT_ROOT=/tmp/app npm run build (passed)");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/ade-cli/src/adeRpcServer.test.ts` around lines 3497 - 3566, Update the
test "normalizes legacy string test summaries for report_result" to assert that
the normalized testsRun.raw preserves all original input strings: after calling
callTool(handler, "report_result", ...) check
response.structuredContent.report.testsRun.raw includes each of the three
provided strings ("npm run typecheck (passed)", "npm test (passed: 2 files, 3
tests)", "ADE_PROJECT_ROOT=/tmp/app npm run build (passed)"). Modify the
expectation that currently uses expect.stringContaining for a single string to
verify all three substrings are present (e.g., three stringContaining checks or
a combined check) so the test ensures complete legacy-data retention for
testsRun.raw.

3551-3556: ⚖️ Poor tradeoff

Consider adding edge case tests for legacy test normalization.

This test covers the happy path where one entry contains explicit numeric counts. To ensure robust parsing behavior, consider adding test cases for:

  • All entries without explicit counts (to verify default behavior)
  • Multiple entries with explicit counts (to verify summing logic)
  • Entries with failed/skipped indicators (to verify failed and skipped count parsing)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/ade-cli/src/adeRpcServer.test.ts` around lines 3551 - 3556, Add
edge-case unit tests for the legacy test normalization logic that exercises the
testsRun parsing: create tests that (1) supply multiple entries with no explicit
numeric counts to assert defaults are applied, (2) supply multiple entries with
explicit numeric counts to assert counts are summed correctly across entries,
and (3) include entries containing failed/skipped indicators to assert failed
and skipped counts are parsed and included in the resulting summary; update the
existing assertions that inspect the testsRun output (the testsRun array used in
the current test) to validate total, passed, failed, and skipped counts for each
case.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/ade-cli/src/cli.ts`:
- Around line 2270-2282: The status handler currently builds the fallback
message by filtering the raw args array before flag consumption, which lets
flags like --status or structured flags leak into the message; fix this in the
report_status branch by first reading known flags via readValue (e.g.
"--text","--message","--summary","--status") and then compute the fallback
message only from positional tokens remaining after removing all recognized
flags and their values (include "--arg" and "--arg-json" in the recognized set
and strip their following value(s) too), ignoring standalone "--", then pass
that cleaned message as summary/nextAction/details when calling
collectGenericObjectArgs so flag values are not included in the generated
message.
- Around line 10250-10255: The current shouldReplaceMachineRuntimeVersion guards
on runtimeVersion being truthy, so a missing/empty daemon version is treated as
compatible; change the logic in shouldReplaceMachineRuntimeVersion to consider a
null/undefined runtimeVersion as incompatible when the CLI has a real VERSION
(i.e., remove the truthy check and instead return VERSION !==
PLACEHOLDER_VERSION && runtimeVersion !== VERSION) so that the function returns
true when the CLI has a non-placeholder VERSION and the daemon version is absent
or different.

In `@apps/desktop/src/main/services/adeActions/registry.ts`:
- Around line 1995-2001: The current finalizeRun implementation passes a
possibly unresolved Promise (result) into waitForMissionCloseoutAfterFinalize,
so if runtime.aiOrchestratorService.finalizeRun or service.finalizeRun is async
the helper receives a Promise instead of the resolved finalized status; fix by
awaiting the finalize call first (await the result of either
runtime.aiOrchestratorService.finalizeRun(args) or service.finalizeRun(args)),
store the resolved value, then call await
waitForMissionCloseoutAfterFinalize(runtime, args.runId, resolvedResult) and
return the resolved value; reference functions: finalizeRun,
runtime.aiOrchestratorService.finalizeRun, service.finalizeRun, and
waitForMissionCloseoutAfterFinalize.
- Around line 1968-1972: The poll loop inside finalizeRun() currently awaits
runtime.missionService.get(missionId) directly so a transient rejection can
cause finalizeRun() to fail; wrap the get call in a try/catch and treat any
error as a best-effort stop/skip of the wait (e.g., log/debug and break out of
the loop or return) instead of rethrowing, keeping the existing timeout and
terminalMissionStatuses checks; update the block that calls
runtime.missionService.get(missionId) to catch errors and exit the polling loop
on failure.

In `@apps/desktop/src/main/services/orchestrator/aiOrchestratorService.test.ts`:
- Line 6698: The waitFor call awaiting mission completion uses the default 2s
timeout which causes CI flakes; update the await waitFor invocation (the call
that checks fixture.missionService.get(mission.id)?.status === "completed") to
pass an explicit longer timeout option (e.g., 10_000 ms or another appropriate
value) as the second argument so the test allows more time for async
finalization.

In `@apps/desktop/src/main/services/orchestrator/aiOrchestratorService.ts`:
- Around line 2994-3031: The code marks read-only worker runs as "planningLike"
which suppresses plain summary evidence; update the planningLike computation to
stop treating meta.readOnlyExecution as equivalent to a planning phase.
Specifically, modify the planningLike variable (currently checking
meta.readOnlyExecution === true, meta.phaseKey and meta.stepType) to only
consider explicit planning markers (meta.phaseKey and meta.stepType) and remove
the meta.readOnlyExecution check so summaries from read-only validation/review
phases are preserved when collecting details (affects uses in the blocks that
read report and validationReport).
- Around line 5453-5461: The success sync is being triggered before the
retrospective write completes, risking stale mission state; instead, create a
promise (e.g., retrospectiveStateSync) that awaits the
updateMissionStateDocument call (including latestRetrospective and reflections
via orchestratorService.listReflections) and catch/log errors using the existing
logger, then chain the syncMissionFromRun(runId, "finalize_run_succeeded") in a
.then() (or await) after retrospectiveStateSync so the retrospective persistence
completes before triggering syncMissionFromRun; reference
updateMissionStateDocument, latestRetrospective, retrospective,
orchestratorService.listReflections, retrospectiveStateSync, and
syncMissionFromRun in your fix.

In `@apps/desktop/src/main/services/orchestrator/coordinatorTools.ts`:
- Around line 2465-2473: The mustPrecede check uses
phaseHasSucceeded(predecessorPhase) which allows a predecessor to be considered
OK even if it skipped required validation gates; make this gating consistent
with set_current_phase by using
hasValidatedSuccessfulCompletion(predecessorPhase) (or otherwise ensure
validateRequiredValidationGates is expanded to include mustPrecede-linked
phases) when evaluating predecessors returned by
resolveMustPrecedePredecessors(sorted, currentPhase); update the for-loop that
iterates mustPrecedePredecessors to call hasValidatedSuccessfulCompletion and
return the same invalid/reason structure if that check fails so spawn gating and
phase-transition gating behave identically.
- Around line 1675-1680: phaseAllowsValidationWorker currently only accepts
exact step types "validation", "review", or "test_review", which excludes custom
review phases; update the check in phaseAllowsValidationWorker to treat any
resolved step type containing "review" or "validation" (e.g.,
stepType.includes("review") || stepType.includes("validation")) as
validation-capable, while preserving the existing fallback that inspects
phase.validationGate (gate.required and gate.tier === "dedicated"); reference
resolvePhaseStepType(phase.phaseKey), phase.phaseKey, phase.validationGate,
gate.required, and gate.tier when making the change.

In `@apps/desktop/src/main/services/orchestrator/executionPolicy.ts`:
- Around line 930-948: validationSurrogateSearchText currently only concatenates
free-text fields, so structured phase/report fields stored in step.metadata or
in meta.lastResultReport/meta.lastValidationReport are not indexed; update
validationSurrogateSearchText to also extract and include structured keys used
by the validation finalizer (e.g., targetPhase, phase, reviewEvidence or similar
fields present on meta, meta.lastResultReport and meta.lastValidationReport) by
converting them to strings (e.g., grab meta.targetPhase ||
lastResultReport.phase || lastValidationReport.phase and stringify any objects
or arrays) and append them to the values array before calling
normalizeSearchableText so canUseValidationStepAsPhaseCompletionSurrogate can
match structured phase/report data as well.
- Around line 1293-1297: The fallback loop that iterates phaseMap.keys() can
re-add the "planning" phase into phaseOrder; update that loop in
buildExecutionPlanPreviewFromPhases so it also skips the planning phase (e.g.,
if phaseName === "planning") just like the first loop does, i.e., continue when
seenPhases.has(phaseName) OR phaseName === "planning", so planning cards are
never pushed into phaseOrder as a fallback.

In `@apps/desktop/src/main/services/orchestrator/orchestratorService.ts`:
- Around line 1440-1443: The planSummary assignment currently treats
rawPlan?.summary as real text without excluding plan-template placeholders;
change the predicate to also call the plan-template placeholder detector (use
the same check used for plan.markdown, e.g.
!looksLikePlanTemplatePlaceholder(rawPlan.summary) or combine with
!looksLikeReportTemplatePlaceholder if both are relevant) so that planSummary
still requires a non-empty trimmed string and not a template placeholder; update
the same pattern wherever plan.summary is normalized (e.g., the other summary
handling around the planSummary/plan.markdown logic) to ensure placeholders are
never promoted into summary.
- Around line 331-334: getWorkerStepOutputPath currently scopes artifacts only
by stepKey, allowing retries to pick up stale outputs; change the function
signature to accept an attempt identifier (e.g., attemptId: string or number)
and include a sanitized attemptId in the filename (e.g.,
`step-output-${sanitizedStepKey}-attempt-${sanitizedAttempt}.md`), update the
sanitation logic to handle the attempt value, and update all callers of
getWorkerStepOutputPath (and the other similar occurrences around the codebase
that generate/read step-output-*.md) to pass the current attempt identifier so
recovered files are scoped per-attempt not per-step.

---

Outside diff comments:
In `@apps/desktop/src/main/services/orchestrator/coordinatorTools.ts`:
- Around line 3292-3300: The validator-specialist spawn check is bypassed
because validateDelegationPromptForCurrentPhase is called with roleName: null
and validationHeuristics: "prompt_only", so looksLikeValidationWorkerRequest
never considers validator/reviewer roles; update the call sites (including the
one using validateDelegationPromptForCurrentPhase and the subagent delegation
calls below) to pass an explicit roleName of "validator" or "reviewer"
(whichever matches your role enum) when evaluating a validation specialist, or
change validationHeuristics to include role-based checks (e.g.,
"prompt_and_role") so looksLikeValidationWorkerRequest can detect validator
requests; ensure the same fix is applied consistently to all matching calls in
coordinatorTools.ts.

---

Nitpick comments:
In `@apps/ade-cli/src/adeRpcServer.test.ts`:
- Around line 3497-3566: Update the test "normalizes legacy string test
summaries for report_result" to assert that the normalized testsRun.raw
preserves all original input strings: after calling callTool(handler,
"report_result", ...) check response.structuredContent.report.testsRun.raw
includes each of the three provided strings ("npm run typecheck (passed)", "npm
test (passed: 2 files, 3 tests)", "ADE_PROJECT_ROOT=/tmp/app npm run build
(passed)"). Modify the expectation that currently uses expect.stringContaining
for a single string to verify all three substrings are present (e.g., three
stringContaining checks or a combined check) so the test ensures complete
legacy-data retention for testsRun.raw.
- Around line 3551-3556: Add edge-case unit tests for the legacy test
normalization logic that exercises the testsRun parsing: create tests that (1)
supply multiple entries with no explicit numeric counts to assert defaults are
applied, (2) supply multiple entries with explicit numeric counts to assert
counts are summed correctly across entries, and (3) include entries containing
failed/skipped indicators to assert failed and skipped counts are parsed and
included in the resulting summary; update the existing assertions that inspect
the testsRun output (the testsRun array used in the current test) to validate
total, passed, failed, and skipped counts for each case.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 45925274-8cd9-4e50-93f0-8d2a24271372

📥 Commits

Reviewing files that changed from the base of the PR and between 88fa9cd and f4e06a6.

⛔ Files ignored due to path filters (1)
  • docs/features/missions/README.md is excluded by !docs/**
📒 Files selected for processing (23)
  • apps/ade-cli/src/adeRpcServer.test.ts
  • apps/ade-cli/src/adeRpcServer.ts
  • apps/ade-cli/src/bootstrap.ts
  • apps/ade-cli/src/cli.test.ts
  • apps/ade-cli/src/cli.ts
  • apps/ade-cli/src/stdioRpcDaemon.test.ts
  • apps/desktop/src/main/services/adeActions/registry.ts
  • apps/desktop/src/main/services/missions/missionPreflightService.ts
  • apps/desktop/src/main/services/missions/missionService.ts
  • apps/desktop/src/main/services/missions/phaseEngine.test.ts
  • apps/desktop/src/main/services/missions/phaseEngine.ts
  • apps/desktop/src/main/services/orchestrator/aiOrchestratorService.test.ts
  • apps/desktop/src/main/services/orchestrator/aiOrchestratorService.ts
  • apps/desktop/src/main/services/orchestrator/coordinatorAgent.ts
  • apps/desktop/src/main/services/orchestrator/coordinatorEventFormatter.ts
  • apps/desktop/src/main/services/orchestrator/coordinatorTools.test.ts
  • apps/desktop/src/main/services/orchestrator/coordinatorTools.ts
  • apps/desktop/src/main/services/orchestrator/executionPolicy.test.ts
  • apps/desktop/src/main/services/orchestrator/executionPolicy.ts
  • apps/desktop/src/main/services/orchestrator/orchestratorAdapters.test.ts
  • apps/desktop/src/main/services/orchestrator/orchestratorService.test.ts
  • apps/desktop/src/main/services/orchestrator/orchestratorService.ts
  • apps/desktop/src/main/services/orchestrator/providerOrchestratorAdapter.ts

Comment thread apps/ade-cli/src/cli.ts Outdated
Comment thread apps/ade-cli/src/cli.ts Outdated
Comment thread apps/desktop/src/main/services/adeActions/registry.ts
Comment thread apps/desktop/src/main/services/adeActions/registry.ts
Comment thread apps/desktop/src/main/services/orchestrator/aiOrchestratorService.test.ts Outdated
Comment thread apps/desktop/src/main/services/orchestrator/coordinatorTools.ts
Comment thread apps/desktop/src/main/services/orchestrator/executionPolicy.ts
Comment thread apps/desktop/src/main/services/orchestrator/executionPolicy.ts
Comment thread apps/desktop/src/main/services/orchestrator/orchestratorService.ts Outdated
Comment thread apps/desktop/src/main/services/orchestrator/orchestratorService.ts
Comment thread apps/ade-cli/src/adeRpcServer.ts
Comment thread apps/desktop/src/main/services/adeActions/registry.ts
Comment thread apps/desktop/src/main/services/orchestrator/coordinatorTools.ts Outdated
Comment thread apps/desktop/src/main/services/orchestrator/orchestratorService.ts
@arul28 arul28 merged commit 70e5bb3 into main May 16, 2026
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants