Workspace cleanup orchestrator hangs on full workspace scan #415

@chubes4

Summary

Both `workspace cleanup plan --mode=retention --dry-run` and `workspace worktree cleanup <repo>` hang indefinitely on a workspace with ~15 merged worktrees. The orchestrator never returns output and never completes; only a timeout kills it.

The per-worktree `workspace worktree remove <repo> <branch>` ability works instantly. Manually iterating per worktree completed all 12 removals in under 30 seconds total. The bulk planner/cleanup paths are the issue.

Reproduction

On a workspace with 15+ worktrees for a single repo (mix of `state=active` with `liveness=stale`, plus one `state=cleanup_eligible`):

```
# Hangs >5 minutes, no output:
studio wp datamachine-code workspace worktree cleanup data-machine-code

# Hangs >3 minutes, no output:
studio wp datamachine-code workspace cleanup plan --mode=retention --dry-run
```

In contrast:

```
# Returns in <2 seconds with clean success:
studio wp datamachine-code workspace worktree remove data-machine-code <branch>
```

Evidence

This session ran into the hang while cleaning up DMC's own workspace after merging #411. The DMC primary at `/Users/chubes/Developer/data-machine-code` had:

  • 1 primary (`main`)
  • 15 worktrees, all but 2 merged into main (some via squash, some plain)
  • 1 worktree flagged `state=cleanup_eligible` (the others were `state=active`, `liveness=stale`)
  • 1 worktree (`fix/release-lint-cleanup`) with 36 uncommitted modified files

Both orchestrator paths timed out. Manual per-worktree removal completed 13 worktrees cleanly (12 plain + 1 with `--force` for the dirty one).

Code path

`inc/Cli/Commands/WorkspaceCommand.php::run_cleanup_plan()` calls `wp_get_ability('datamachine/workspace-cleanup-plan')->execute($input)`. The hang is somewhere inside that ability, likely in either:

  • Per-worktree git status probes running synchronously across the whole set
  • Per-worktree GitHub PR-state lookups (the table output shows PR URLs are populated for some rows — could mean each row hits the GitHub API)
  • A deadlock between Action Scheduler and the ability's own task scheduling (cleanup is described as "task-backed")
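One way to narrow down which of these is responsible would be to run each probe under its own timeout, so a stuck probe is reported instead of blocking the whole scan. The sketch below is in Python rather than the plugin's PHP, and `run_probe` is a hypothetical helper, not something in the codebase. The stand-in commands simulate a fast probe and a blocked one; a real probe might be something like `git -C <path> status --porcelain`.

```python
import subprocess
import sys
import time


def run_probe(cmd, timeout_s):
    """Run one probe command; report a timeout instead of blocking forever."""
    started = time.monotonic()
    try:
        # subprocess.run kills the child process when the timeout expires.
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        status = "ok" if done.returncode == 0 else "failed"
        output = done.stdout
    except subprocess.TimeoutExpired:
        status, output = "timeout", ""
    return {
        "cmd": cmd,
        "status": status,
        "output": output,
        "elapsed_s": time.monotonic() - started,
    }


if __name__ == "__main__":
    # Stand-in probes (assumptions): one fast, one that blocks past the limit.
    fast = [sys.executable, "-c", "print('clean')"]
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    for probe in (fast, slow):
        result = run_probe(probe, timeout_s=1.0)
        print(result["status"], f"{result['elapsed_s']:.2f}s")
```

Logging the status and elapsed time per probe should show whether the time goes to git, to the GitHub lookups, or to neither (which would point at the scheduler interaction).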

Suspected scope

The docstring on the cleanup command says:

"Destructive runs are scheduled through Data Machine's system-task scheduler and return immediately with a run_id. Dry-runs stay synchronous and delegate to the same workspace abilities as the lower-level workspace worktree cleanup* commands until DB-backed cleanup plans land."

So sync dry-run is documented behavior. The hang is in the synchronous delegation path.

Impact

  • Operator cleanup workflow (`cleanup plan` → `cleanup apply`) is unusable on workspaces with substantial worktree counts.
  • The agent runtime cannot self-clean its own worktree graveyard via the canonical cleanup ability.
  • Forces manual per-worktree iteration, which works but doesn't leverage the planner's safety probes (`dirty`, `unpushed`, `containment`, etc.).

Suggested next steps

  1. Add a hard timeout + early bail in the planner so it cannot exceed a configurable budget without returning a partial plan.
  2. Bound the per-worktree probe set: skip GitHub lookups when not needed, batch-status all worktrees in one `git worktree list --porcelain` pass instead of per-worktree shell-out, etc.
  3. If the hang is in Action Scheduler interaction, fall through to a sync code path when no scheduler is available (e.g. WP-CLI context).
  4. Add CLI feedback (progress dots, candidate counts, current step) so it's clear whether the command is hung or just slow.
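A rough sketch of how steps 1, 2 and 4 could fit together, in Python rather than the plugin's PHP. Everything here is hypothetical: `parse_worktrees`, `plan_with_budget`, the `probe` callback and the row keys are illustrative names, not the existing ability's API. Note that `git worktree list --porcelain` reports path, HEAD and branch (plus flags such as `detached` or `locked`) but not dirty state, so a dirty check would still need a per-worktree probe.

```python
import time


def parse_worktrees(porcelain):
    """Parse `git worktree list --porcelain` output into one dict per worktree."""
    worktrees, current = [], {}
    for line in porcelain.splitlines():
        if not line.strip():  # a blank line ends one worktree record
            if current:
                worktrees.append(current)
                current = {}
            continue
        key, _, value = line.partition(" ")
        current[key] = value or True  # flags such as `detached` carry no value
    if current:
        worktrees.append(current)
    return worktrees


def plan_with_budget(worktrees, probe, budget_s, clock=time.monotonic, report=print):
    """Probe worktrees until the time budget runs out, then return a partial plan."""
    deadline = clock() + budget_s
    rows = []
    for index, wt in enumerate(worktrees, start=1):
        if clock() >= deadline:
            break  # early bail: keep what has been probed so far
        report(f"[{index}/{len(worktrees)}] probing {wt['worktree']}")
        rows.append({"path": wt["worktree"], **probe(wt)})
    return {
        "rows": rows,
        "complete": len(rows) == len(worktrees),
        "skipped": [wt["worktree"] for wt in worktrees[len(rows):]],
    }


if __name__ == "__main__":
    sample = (
        "worktree /repo\nHEAD 1111\nbranch refs/heads/main\n\n"
        "worktree /repo@fix\nHEAD 2222\ndetached\n"
    )
    plan = plan_with_budget(parse_worktrees(sample), lambda wt: {"dirty": False}, 5.0)
    print(plan["complete"], len(plan["rows"]))
```

The budget is only checked between probes, so a single probe that blocks forever would still hang the loop; each probe needs its own timeout for the budget to be a hard limit.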

AI assistance

  • AI assistance: Yes
  • Tool(s): Claude Code (Sonnet 4.5)
  • Used for: Filed this issue based on direct observation during DMC v0.43.0 release session. The hang was reproduced live, and the comparison against per-worktree remove was performed end-to-end. Chris confirmed all observations.
