feat: v2.0.0-rc3 — fix audit findings (finalizer, state_overrides, idempotency + 6 important)#41
Merged
This was referenced May 16, 2026
Closed
Open
aksOps added a commit that referenced this pull request May 16, 2026
#45)

The v2.0.0-rc3 finalizer-asymmetry fix (PR #41) correctly handled graph-completed-without-pause: every non-streaming run now calls _finalize_session_status_async() to flip the row to a terminal status. But the *paused* branch was left as a status-write no-op: when the graph paused at a HITL gate, the session row stayed at its pre-pause status ('new' / 'in_progress') instead of transitioning to 'awaiting_input'. UIs that filter by status='awaiting_input' (the approvals queue, the sessions rail Active group, /sessions in-flight listing) missed paused sessions entirely. This was the sibling case of CRITICAL #1 and was filed as #42 after the rc3 ship.

The fix adds a new lock-guarded helper that mirrors _finalize_session_status_async on the paused side:

    Orchestrator._mark_session_paused_async(session_id) -> str | None

It loads the row, captures the from-status, flips status to 'awaiting_input', saves, and emits both status_changed (per-session event log) and session.status_changed (cross-session SSE) events via the existing _emit_status_changed_event helper. It is a no-op when the row is already at 'awaiting_input', already at a terminal status (guard against a late paused-write unwinding a finalize that landed in between), or the row is missing.

Four call sites updated to use it: the three start_session + stream_session + retry_session paths in orchestrator.py, and the OrchestratorService._run() background task in service.py.

Regression coverage in tests/test_finalizer_paths.py:
- _PausedAtGateGraph fake graph that returns from ainvoke and reports a non-empty `next` tuple to aget_state (mirrors langgraph 1.x's interrupt-boundary state).
- test_orchestrator_start_session_marks_paused_run_awaiting_input: direct (non-streaming) path produces status='awaiting_input' on the row AND emits status_changed + session.status_changed events.
- test_service_background_start_session_marks_paused_run_awaiting_input: same guarantee through the OrchestratorService._run() task.
- test_mark_session_paused_is_no_op_when_already_awaiting_input: no spurious event emission.
- test_mark_session_paused_is_no_op_on_terminal_status: guard against a late paused-write unwinding a completed terminal flip.

Two pre-existing shim tests updated to stub the new method:
- tests/test_retry_session_locked_post_policy.py _StubOrch
- tests/test_triggers/test_orchestrator_trigger_kwarg.py (orchestrator built via __new__, bypasses __init__)

Verified:
- ASR_WEB_DIST=/tmp/empty uv run pytest --no-cov -> 1349 passed, 8 skipped
- uv run ruff check src/ tests/ -> clean
- uv run pyright src/runtime -> clean
- cd web && npm run typecheck/lint/test:unit/build/check:size -> green
- dist/* regenerated (HARD-08)
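The paused-side helper described in the commit message can be pictured roughly as follows. This is a minimal sketch, not the code in orchestrator.py: `SketchOrchestrator`, `SessionRow`, the in-memory `rows` dict, the `events` list, and the `TERMINAL_STATUSES` set are all hypothetical stand-ins, and the return value is assumed to be the previous status (or `None` for a no-op), which the commit message does not spell out.

```python
import asyncio
from dataclasses import dataclass, field

# Assumption: the real terminal statuses are not listed in the commit message.
TERMINAL_STATUSES = frozenset({"completed", "failed"})


@dataclass
class SessionRow:
    session_id: str
    status: str


@dataclass
class SketchOrchestrator:
    """Hypothetical stand-in that models only the paused-write logic."""

    rows: dict = field(default_factory=dict)      # session_id -> SessionRow
    events: list = field(default_factory=list)    # recorded (kind, id, from, to)
    _lock: asyncio.Lock = field(default_factory=asyncio.Lock)

    def _emit_status_changed_event(self, session_id, from_status, to_status):
        # Stand-in for the existing helper: one per-session event and one
        # cross-session event are recorded for each transition.
        self.events.append(("status_changed", session_id, from_status, to_status))
        self.events.append(
            ("session.status_changed", session_id, from_status, to_status)
        )

    async def mark_session_paused(self, session_id: str):
        """Flip the row to 'awaiting_input'; return the old status or None."""
        async with self._lock:
            row = self.rows.get(session_id)
            if row is None:
                return None  # missing row: no-op
            if row.status == "awaiting_input":
                return None  # already paused: no spurious event
            if row.status in TERMINAL_STATUSES:
                return None  # a finalize landed first: do not unwind it
            from_status = row.status
            row.status = "awaiting_input"  # the "save" step in this sketch
            self._emit_status_changed_event(
                session_id, from_status, "awaiting_input"
            )
            return from_status


async def demo():
    orch = SketchOrchestrator()
    orch.rows["s1"] = SessionRow("s1", "in_progress")
    first = await orch.mark_session_paused("s1")
    second = await orch.mark_session_paused("s1")
    return first, second, orch.rows["s1"].status, len(orch.events)
```

Calling `asyncio.run(demo())` exercises the transition once and then the already-paused no-op. Holding the lock across the load, check, and write is what keeps a late paused-write from overwriting a terminal status written by a concurrent finalize.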



Findings Fixed
- CRITICAL #1 - Background and non-streaming runs now finalize terminal status.
- CRITICAL #2 - `state_overrides` persist beyond `environment`.
- CRITICAL #3 - Webhook idempotency uses atomic reservation before starting sessions.
- IMPORTANT #1 - Recent-events SSE emits React-compatible session summary payloads.
- IMPORTANT #2 - Backend emits `session.agent_running` start and terminal clear events.
- IMPORTANT #3 - Per-session reducer handles real backend delta shapes.
- IMPORTANT #4 - Vector write-through failures no longer bubble after SQL commit.
- IMPORTANT #5 - Generic session start accepts `state_overrides` and rejects environment conflicts.
- IMPORTANT #6 - TS/Python literal drift is corrected and guarded by parity tests.
- STYLE #1 - CORS config is wired through `cfg.api`.
- STYLE #2 - Unknown-route envelope test now probes an API path.
- STYLE #3 - Web test warning noise was not addressed in this PR.
  Description, React `act(...)`, Select read-only, and NO_COLOR warning cleanup.

Verification
- `ASR_WEB_DIST=/tmp/empty uv run pytest -x --no-cov` - 1345 passed, 8 skipped
- `uv run ruff check src/ tests/` - passed
- `uv run pyright src/runtime` - 0 errors
- `cd web && npm run typecheck && npm run lint && npm run test:unit && npm run build && npm run check:size` - passed
- `uv run python scripts/build_single_file.py` - regenerated `dist/` bundles

Notes
- `config/config.yaml` and `.superpowers/` unstaged.
- `web/src/state/selectedRef.tsx`; tests still emit the known dialog/act/select warning noise.
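
The webhook idempotency finding mentions reserving a key atomically before a session is started. A minimal sketch of that general pattern follows, assuming a SQL store with a uniqueness constraint. The table name `webhook_reservations`, the helper names, and the use of SQLite are illustrative assumptions, not the PR's actual schema or code.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a uniqueness constraint on the key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE webhook_reservations (idempotency_key TEXT PRIMARY KEY)"
    )
    return conn


def reserve(conn: sqlite3.Connection, key: str) -> bool:
    """Return True only for the first caller that claims this key."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO webhook_reservations (idempotency_key) VALUES (?)",
                (key,),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary-key constraint rejected a duplicate delivery.
        return False


def handle_webhook(conn: sqlite3.Connection, key: str, start_session):
    """Start a session only if the reservation succeeded."""
    if not reserve(conn, key):
        return None  # duplicate delivery: nothing is started
    return start_session()


conn = open_store()
started = []
first = handle_webhook(conn, "evt-1", lambda: started.append("evt-1") or "started")
second = handle_webhook(conn, "evt-1", lambda: started.append("evt-1") or "started")
```

The point of reserving first is that the database constraint, rather than a check-then-insert in application code, decides which delivery wins, so two concurrent deliveries of the same event cannot both start a session.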