fix(ci): widen release-darwin poll budget + early-bail on release-go failure#165

Merged: aksOps merged 1 commit into main from fix/release-darwin-race on May 14, 2026
@aksOps aksOps commented May 14, 2026

Summary

Both `release-go.yml` and `release-darwin.yml` fire on the same tag push. `release-go` runs goreleaser (CGO + Kuzu + Cosign + SBOM), which typically takes 4–8 minutes to produce the Release object. `release-darwin` polled only 3 times at 30s intervals (90s total) and timed out every time:

```
Release v0.X.Y not yet visible, waiting 30s (1/3)...
Release v0.X.Y not yet visible, waiting 30s (2/3)...
Release v0.X.Y not yet visible, waiting 30s (3/3)...
::error::Release v0.X.Y never appeared; release-go.yml may have failed
```

Both v0.3.0 and v0.4.0 needed a manual `gh run rerun --failed` to recover after the upstream Release became visible.
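
For reference, the manual recovery can be sketched as a small shell helper. This is an illustration rather than anything shipped in this PR: the function name `rerun_failed_darwin` is hypothetical, and it assumes a tag-triggered run reports the tag name as its head branch, so `--branch "$tag"` selects it.

```shell
# Hypothetical helper: re-run the failed jobs of the newest failed
# release-darwin.yml run for a tag. Not part of the workflow itself.
# Usage: rerun_failed_darwin TAG OWNER/REPO
rerun_failed_darwin() {
  local tag="$1" repo="$2" run_id
  # Assumption: for tag pushes the run's head branch is the tag name.
  run_id="$(gh run list --repo "$repo" --workflow release-darwin.yml \
    --branch "$tag" --status failure --limit 1 \
    --json databaseId --jq '.[0].databaseId // empty')"
  if [ -z "$run_id" ]; then
    echo "no failed release-darwin.yml run found for $tag" >&2
    return 1
  fi
  # --failed re-runs only the failed jobs, as in the manual recovery above.
  gh run rerun "$run_id" --failed --repo "$repo"
}
```

It only makes sense to call this once the upstream Release is visible; otherwise the rerun hits the same timeout.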

Fix

  1. Bump poll budget to 30 × 30s = 15 minutes. Covers release-go's worst case plus headroom for proxy publish lag.
  2. Early-bail when the upstream workflow fails. On each poll iteration, check the `release-go.yml` workflow run status for this tag via `gh run list`. If it concluded as `failure` / `cancelled` / `timed_out`, exit immediately with an actionable error rather than riding the full 15-min timeout.
  3. Pin `--repo "$REPO"` on every `gh` invocation. The macOS runner's inferred repo (from `gh auth status`) can drift; explicit pins make the workflow position-independent.

No change to release-go.yml — it was working as designed; the race was entirely on the darwin side's tight polling window.
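
The three changes can be sketched together as one shell function. This is a minimal sketch, not the actual step from `release-darwin.yml`: the function names, argument order, and messages are illustrative, and the `--branch "$tag"` filter assumes that a tag-triggered run reports the tag name as its head branch.

```shell
# Returns 0 when the upstream conclusion means the Release will never appear.
is_terminal_failure() {
  case "$1" in
    failure|cancelled|timed_out) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: wait_for_release TAG OWNER/REPO [MAX_POLLS] [INTERVAL_SECONDS]
# Defaults give 30 polls x 30s = 15 minutes.
wait_for_release() {
  local tag="$1" repo="$2" max="${3:-30}" interval="${4:-30}"
  local i=1 conclusion
  while [ "$i" -le "$max" ]; do
    # Success path: the Release object is visible.
    if gh release view "$tag" --repo "$repo" >/dev/null 2>&1; then
      echo "Release $tag is visible"
      return 0
    fi
    # Early bail: read the conclusion of the newest release-go.yml run.
    # An in-progress or missing run yields an empty string.
    conclusion="$(gh run list --repo "$repo" --workflow release-go.yml \
      --branch "$tag" --limit 1 \
      --json conclusion --jq '.[0].conclusion // ""' 2>/dev/null || true)"
    if is_terminal_failure "$conclusion"; then
      echo "::error::release-go.yml concluded as $conclusion for $tag; not waiting further"
      return 1
    fi
    echo "Release $tag not yet visible, waiting ${interval}s ($i/$max)..."
    sleep "$interval"
    i=$((i + 1))
  done
  echo "::error::Release $tag never appeared after $((max * interval))s"
  return 1
}
```

Every `gh` call carries an explicit `--repo`, so the function behaves the same regardless of the working directory it runs from.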

Test plan

  • YAML parses (`python3 -c "import yaml; yaml.safe_load(...)"`)
  • Pattern hand-traced against the v0.4.0 failure: under the old code, the third poll fired ~90s after job start while release-go was still inside the goreleaser `build` step at 5+ minutes. The new code's 30th poll fires at 15 minutes, well past goreleaser's typical wall-clock time.
  • Functional confirmation lands on the next tag push.

Out of scope

  • Switching to a `workflow_run` trigger instead of `push: tags:` would be cleaner long-term (no race possible) but requires reworking how the darwin job retrieves the source SHA — saving for a follow-up.
  • `draft: true` in `.goreleaser.yml` means every release still requires a manual `gh release edit --draft=false` to publish. Defensible default for a security-sensitive release — left in place.
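
The manual publish step mentioned above can be sketched as a guarded shell helper. The function name `publish_draft_release` is hypothetical and this is not part of the PR; it only wraps the `gh release edit --draft=false` call so that it is skipped when the release is already public.

```shell
# Hypothetical helper: publish a draft release, or do nothing if it is
# already published. Usage: publish_draft_release TAG OWNER/REPO
publish_draft_release() {
  local tag="$1" repo="$2" is_draft
  # Fail loudly if the release cannot be read at all.
  if ! is_draft="$(gh release view "$tag" --repo "$repo" \
      --json isDraft --jq '.isDraft')"; then
    echo "could not read release $tag in $repo" >&2
    return 1
  fi
  if [ "$is_draft" != "true" ]; then
    echo "Release $tag is already published"
    return 0
  fi
  gh release edit "$tag" --repo "$repo" --draft=false
}
```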

🤖 Generated with Claude Code

…failure

Both release-go.yml and release-darwin.yml fire on the same tag push.
release-go runs goreleaser (CGO + Kuzu + Cosign + SBOM) — typically
4–8 minutes before the Release object appears. release-darwin tried 3
polls × 30s = 90s total and timed out every time:

  Release v0.X.Y not yet visible, waiting 30s (1/3)...
  Release v0.X.Y not yet visible, waiting 30s (2/3)...
  Release v0.X.Y not yet visible, waiting 30s (3/3)...
  ::error::Release v0.X.Y never appeared; release-go.yml may have failed

Both v0.3.0 and v0.4.0 needed a manual `gh run rerun` to recover.

Fix:
  * Bump poll budget to 30 × 30s = 15 minutes (release-go's worst case
    plus headroom).
  * On every poll iteration, also check the release-go workflow run
    status for this tag via `gh run list`. If it concluded as
    failure/cancelled/timed_out, bail with an actionable error instead
    of riding the full 15-min timeout to nowhere.
  * Pin `--repo "$REPO"` on every gh command so the macOS runner's
    inferred repo (from `gh auth status`) can never disagree with the
    actual workflow context.

Verified the YAML still parses; functional verification will land with
the next tag push.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@aksOps aksOps merged commit 6229241 into main May 14, 2026
13 checks passed
@aksOps aksOps deleted the fix/release-darwin-race branch May 14, 2026 13:01
aksOps added a commit that referenced this pull request May 14, 2026
Move the contents of [Unreleased] under a new [v0.4.0] - 2026-05-14
heading. Repopulate [Unreleased] with the three post-v0.4.0 items
already on main: #163 (pflag bump), #164 (harden-runner bump), #165
(release-darwin race fix).

Add a header note explaining the release-history reset: deleting the
pre-v0.4.0 tags from GitHub does not delete them from proxy.golang.org;
every reused tag name would serve the old (often Python-era) content.
v0.4.0 is the first never-previously-used version.

Two factual additions to v0.4.0:
  * PR #162 (module hoist) — was missing from the original Changed
    block when the section was still labelled [Unreleased].
  * PR #161 (BuildInfo fallback) — moved into a dedicated Added bullet
    so `go install` users know their binaries self-identify now.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
aksOps added a commit that referenced this pull request May 14, 2026
Patch release covering the post-v0.4.0 work that's already on main:
  * #163 — github.com/spf13/pflag 1.0.9 → 1.0.10
  * #164 — step-security/harden-runner 2.19.1 → 2.19.2
  * #165 — release-darwin race-fix

Pure CI / dependency hygiene. No codeiq pipeline or detector
behavior changes — same build/test surface as v0.4.0.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>