fix(orch): mark paused-at-gate sessions as awaiting_input (closes #42) #45

Merged: aksOps merged 1 commit into main from fix/v2-rc4-paused-status on May 16, 2026

@aksOps commented on May 16, 2026

Summary

Closes #42. Sibling fix to the CRITICAL #1 finalizer asymmetry closed in rc3 (#41): graph-paused sessions now flip the row to 'awaiting_input' instead of leaving it at the pre-pause status.

Root cause

PR #41 added the finalizer to non-streaming + service-background ainvoke() call sites, but only on the non-paused branch:

if not await self._is_graph_paused(inc.id):
    await self._finalize_session_status_async(inc.id)
# paused branch: no-op  <-- bug

The paused branch needed a paired write to mark status='awaiting_input' and emit the matching status_changed + session.status_changed events for the React UI to see the transition.

Fix

New lock-guarded helper Orchestrator._mark_session_paused_async(session_id) that mirrors _finalize_session_status_async for the paused side. Four call sites updated:

  • Orchestrator.start_session (direct path)
  • Orchestrator.stream_session (SSE streaming start)
  • Orchestrator.retry_session (retry path)
  • OrchestratorService._run (background task)

The helper is a no-op when the row is already 'awaiting_input' or terminal (guards against a late paused-write unwinding a finalize that landed first).
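
The paired branch at each call site can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `SketchOrchestrator` and `settle_status` are hypothetical names, and the three underscore-prefixed methods are stubs that only record which side of the branch ran.

```python
import asyncio


class SketchOrchestrator:
    """Hypothetical stand-in; only the branch shape follows the PR description."""

    def __init__(self, paused: bool) -> None:
        self._paused = paused
        self.calls: list = []

    async def _is_graph_paused(self, session_id: str) -> bool:
        # Stub: the real check inspects the graph state for the session.
        return self._paused

    async def _finalize_session_status_async(self, session_id: str) -> None:
        # Stub: the real method flips the row to a terminal status.
        self.calls.append(f"finalize:{session_id}")

    async def _mark_session_paused_async(self, session_id: str) -> None:
        # Stub: the real method flips the row to 'awaiting_input'.
        self.calls.append(f"mark_paused:{session_id}")

    async def settle_status(self, session_id: str) -> None:
        # Hypothetical wrapper: both branches now write a status,
        # where the paused branch used to be a no-op.
        if await self._is_graph_paused(session_id):
            await self._mark_session_paused_async(session_id)
        else:
            await self._finalize_session_status_async(session_id)


async def demo():
    paused = SketchOrchestrator(paused=True)
    done = SketchOrchestrator(paused=False)
    await paused.settle_status("s1")
    await done.settle_status("s2")
    return paused.calls, done.calls
```

Running `asyncio.run(demo())` exercises both branches; exactly one status write happens per run, whichever branch is taken.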

Regression coverage

tests/test_finalizer_paths.py gains a _PausedAtGateGraph fake + 4 new tests:

  • direct path produces status='awaiting_input' + emits status events
  • service-background path produces same
  • no-op when already 'awaiting_input'
  • no-op on terminal status (guards the race against finalize)

Two pre-existing shim tests (_StubOrch, Orchestrator.__new__ stubs) updated to mock the new method.

Verification

  • ASR_WEB_DIST=/tmp/empty uv run pytest --no-cov → 1349 passed / 8 skipped (was 1345 / 8 on rc3; +4 new)
  • uv run ruff check src/ tests/ → clean
  • uv run pyright src/runtime → clean
  • cd web && npm run typecheck && npm run lint && npm run test:unit && npm run build && npm run check:size → green
  • dist/ regenerated (HARD-08)

Live behavior (will re-verify after merge + restart): a session that pauses at a HITL gate will now show status='awaiting_input' on GET /api/v1/sessions/{id} and appear in the in-flight /sessions list, the approvals queue, and the rail's Active group.

Test plan

  • CI gates green
  • Merge + tag v2.0.0-rc4
  • Rebuild SPA, restart backend, create a session that triggers gate_fired → confirm status transitions to awaiting_input in real time

The v2.0.0-rc3 finalizer-asymmetry fix (PR #41) correctly handled
graph-completed-without-pause: every non-streaming run now calls
_finalize_session_status_async() to flip the row to a terminal status.

But the *paused* branch was left as a status-write no-op: when the
graph paused at a HITL gate, the session row stayed at its pre-pause
status ('new' / 'in_progress') instead of transitioning to
'awaiting_input'. UIs that filter by status='awaiting_input' (the
approvals queue, the sessions rail Active group, /sessions in-flight
listing) missed paused sessions entirely.

This was the sibling case of CRITICAL #1 and was filed as #42 after
the rc3 ship.

The fix adds a new lock-guarded helper that mirrors
_finalize_session_status_async on the paused side:

    Orchestrator._mark_session_paused_async(session_id) -> str | None

It loads the row, captures the from-status, flips status to
'awaiting_input', saves, and emits both status_changed (per-session
event log) and session.status_changed (cross-session SSE) events via
the existing _emit_status_changed_event helper. It is a no-op when
the row is already at 'awaiting_input', already at a terminal status
(guard against a late paused-write unwinding a finalize that landed
in between), or the row is missing.
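
A rough sketch of the helper's behavior, under stated assumptions: the in-memory `rows` dict, the single `asyncio.Lock`, the terminal status names, and returning the from-status are all illustrative guesses, since the commit message gives only the signature and the order of steps. The `_emit_status_changed_event` stand-in here just records the two event kinds.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional

# Assumed terminal statuses; the commit message does not list them.
TERMINAL_STATUSES = {"resolved", "failed"}


@dataclass
class SessionRow:
    id: str
    status: str


@dataclass
class SketchOrchestrator:
    rows: dict = field(default_factory=dict)      # stand-in for the session store
    events: list = field(default_factory=list)    # stand-in for both event channels
    _lock: asyncio.Lock = field(default_factory=asyncio.Lock)

    def _emit_status_changed_event(self, session_id, from_status, to_status):
        # Stand-in for the existing helper: one per-session event, one cross-session event.
        self.events.append(("status_changed", session_id, from_status, to_status))
        self.events.append(("session.status_changed", session_id, from_status, to_status))

    async def _mark_session_paused_async(self, session_id: str) -> Optional[str]:
        async with self._lock:
            row = self.rows.get(session_id)
            if row is None:
                return None  # missing row: no-op
            if row.status == "awaiting_input" or row.status in TERMINAL_STATUSES:
                return None  # already paused, or a finalize landed first: no-op
            from_status = row.status
            row.status = "awaiting_input"  # in-memory "save"
            self._emit_status_changed_event(session_id, from_status, row.status)
            return from_status  # assumed return value


async def demo():
    # Built inside the coroutine so the lock is created under the running loop.
    orch = SketchOrchestrator(rows={
        "a": SessionRow("a", "in_progress"),
        "b": SessionRow("b", "awaiting_input"),
        "c": SessionRow("c", "resolved"),
    })
    results = [await orch._mark_session_paused_async(s) for s in ("a", "b", "c", "missing")]
    return orch, results
```

Checking the status inside the lock is what makes the terminal guard meaningful: a finalize that wins the race is observed before the paused write decides whether to proceed.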

Four call sites are updated to use it: the start_session,
stream_session, and retry_session paths in orchestrator.py, and the
OrchestratorService._run() background task in service.py.

Regression coverage in tests/test_finalizer_paths.py:
  - _PausedAtGateGraph fake graph that returns from ainvoke and
    reports a non-empty `next` tuple to aget_state (mirrors langgraph
    1.x's interrupt-boundary state).
  - test_orchestrator_start_session_marks_paused_run_awaiting_input:
    direct (non-streaming) path produces status='awaiting_input' on
    the row AND emits status_changed + session.status_changed events.
  - test_service_background_start_session_marks_paused_run_awaiting_input:
    same guarantee through the OrchestratorService._run() task.
  - test_mark_session_paused_is_no_op_when_already_awaiting_input:
    no spurious event emission.
  - test_mark_session_paused_is_no_op_on_terminal_status: guard
    against a late paused-write unwinding a completed terminal flip.

Two pre-existing shim tests updated to stub the new method:
  - tests/test_retry_session_locked_post_policy.py _StubOrch
  - tests/test_triggers/test_orchestrator_trigger_kwarg.py
    (orchestrator built via __new__, bypasses __init__)
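
For illustration, a fake graph along the lines described above might look like the sketch below. It is not the fixture from tests/test_finalizer_paths.py: the class body, the "gate" node name, the ainvoke return value, and the `is_graph_paused` function are assumptions. The only behavior taken from the description is that ainvoke returns normally while aget_state reports a non-empty `next` tuple.

```python
import asyncio
from types import SimpleNamespace


class PausedAtGateGraph:
    """Fake graph: ainvoke returns, but the state still reports pending nodes."""

    def __init__(self, pending=("gate",)):  # "gate" is a hypothetical node name
        self._pending = tuple(pending)
        self.invocations = []

    async def ainvoke(self, payload, config=None):
        # Returns without raising, as a graph does when it stops at an interrupt.
        self.invocations.append(payload)
        return {}

    async def aget_state(self, config):
        # Mimics a state snapshot: a non-empty `next` means work is still pending.
        return SimpleNamespace(next=self._pending, values={})


async def is_graph_paused(graph, session_id):
    # Hypothetical version of the pause check: paused if any node is still queued.
    config = {"configurable": {"thread_id": session_id}}
    state = await graph.aget_state(config)
    return bool(state.next)


async def demo():
    paused = PausedAtGateGraph()
    finished = PausedAtGateGraph(pending=())
    await paused.ainvoke({"session_id": "s1"})
    return await is_graph_paused(paused, "s1"), await is_graph_paused(finished, "s2")
```

A fake like this lets the tests drive the paused branch without a real checkpointer or a real HITL gate.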

Verified:
  - ASR_WEB_DIST=/tmp/empty uv run pytest --no-cov -> 1349 passed, 8 skipped
  - uv run ruff check src/ tests/ -> clean
  - uv run pyright src/runtime -> clean
  - cd web && npm run typecheck/lint/test:unit/build/check:size -> green
  - dist/* regenerated (HARD-08)

@aksOps merged commit e20e119 into main on May 16, 2026
9 checks passed

Development

Successfully merging this pull request may close these issues.

bug: paused-at-gate sessions stuck at status='new' instead of transitioning to 'awaiting_input'
