fix(upgrade): re-verify ML-DSA signature on every cache hit by grumbach · Pull Request #100 · WithAutonomi/ant-node

grumbach · 2026-05-19T07:00:29Z

Summary

The shared upgrade binary cache stored the extracted binary and, on a cache hit, returned it after only a SHA-256 check against a sibling .meta.json. SHA-256 is not a security control: anyone able to write to the shared upgrade cache directory ({data_dir}/upgrades/, shared across all node instances on a host — a co-located process, a shared container volume, a low-privilege foothold on the host) could drop a malicious binary plus a forged matching metadata hash, and the next ant-node instance to upgrade would execute it with no signature verification at all — persistent RCE on every co-located node. The ML-DSA-65 signature covers the archive and was only checked on the initial download, never on a cache hit.

Fix

Cache the signed archive + detached signature instead of the extracted binary. BinaryCache::get_verified_archive re-runs ML-DSA-65 verification on every cache hit; the binary is always extracted fresh from the just-verified archive. A tampered archive, tampered or missing signature, or forged metadata fails verification against the pinned release public key, so a poisoned cache entry is rejected and a fresh verified download runs.
Stage into a process-private temp dir and verify that copy, then extract from the same private path. Closes the verify-vs-extract TOCTOU on the shared cache files: an attacker cannot swap the bytes between when the verifier reads them and when the extractor reads them.
Size policy before any copy or read. fs::symlink_metadata + file_type().is_file() rejects symlinks / FIFOs / devices outright; archive size is bounded by MAX_ARCHIVE_SIZE_BYTES and the signature must be exactly SIGNATURE_SIZE bytes. Otherwise an attacker could plant cached.archive -> /dev/zero (stats as 0 bytes) and force unbounded disk fill in the staging dir or OOM in signature::verify.
Cache only after successful extraction. A validly-signed-but-malformed release no longer becomes a shared cache poison pill that every later node downloads, fails to extract, and re-downloads.
Cache dir tightened to 0700 on Unix as defence in depth (cache_dir.rs); the ML-DSA gate is the primary control.
store_archive mirrors the same size / file-type / signature checks before persisting, so a poisoned entry cannot be created through the supported path either.

Production verifies against the pinned RELEASE_SIGNING_KEY; a #[cfg(test)]-only constructor injects a generated key so the cache is unit-testable without weakening the production trust anchor.

Tests

src/upgrade/binary_cache.rs covers:

Tamper path: stores a validly-signed archive, then overwrites the cached archive with a malicious payload and forges the metadata SHA-256 to match it (defeating the corruption pre-check). The entry is rejected because ML-DSA re-verification runs on every hit.
Post-hit shared-file swap: the verified private copy is unaffected by a swap on the shared cache file after the hit.
Symlink bypass attempt (Unix): planting cached.archive -> /dev/zero is rejected pre-copy by the is_file() check.
Oversize archive / wrong-size signature rejected before any read.
Unsigned archive, missing signature, missing metadata, round-trip storage.

Residual (not addressed here)

Cache entries are not bound to a specific release version: the ML-DSA signing context is constant across versions, so a same-UID attacker who already has any past validly-signed release can plant it under a newer version's cache key and force a downgrade to that old signed binary. Not RCE (still legitimately-signed bytes) and a same-UID attacker has easier paths anyway. Closing it cleanly requires coordinated changes in the release-signing pipeline, ant-keygen, ant-node, and ant-client; tracked in the binary_cache module docs.

Copilot

Pull request overview

This PR hardens the auto-upgrade disk cache against A1 (RCE-class) cache poisoning by ensuring ML-DSA signature verification is performed on every cache hit and by eliminating TOCTOU between verification and extraction.

Changes:

Cache the signed archive + detached signature (not the extracted binary) and re-verify ML-DSA-65 on every cache hit via a private staged copy.
Add best-effort Unix permission tightening for the shared upgrade cache directory (0700).
Update upgrade apply flow to extract from verified cached archives and cache verified artifacts pre-extraction; add regression/TOCTOU unit tests.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
`src/upgrade/cache_dir.rs`	Best-effort permission tightening of the shared upgrade cache directory on Unix.
`src/upgrade/binary_cache.rs`	Redesign cache to store archive+signature, re-verify ML-DSA on every hit, stage into process-private dir, and add A1/TOCTOU-focused tests.
`src/upgrade/apply.rs`	Switch upgrade resolution to reuse verified cached archives and cache verified artifacts before extraction.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

grumbach · 2026-05-20T08:01:28Z

+        // Fast corruption pre-check on the PRIVATE copy (NOT the security
+        // decision). A copy error or truncation surfaces here.
+        let actual_hash = sha256_file(&private_archive).ok()?;
+        if actual_hash != meta.archive_sha256 {
            warn!(
-                "Binary cache SHA-256 mismatch for version {version} (expected {}, got {})",
-                meta.sha256, actual_hash
+                "Binary cache SHA-256 mismatch for version {version} \
+                 (expected {}, got {actual_hash}) — ignoring cache entry",
+                meta.archive_sha256
            );
            return None;


Already addressed — the cleanup helper is called on this path (line 287 in the current branch tip). The original review pointed at line numbers from the pre-squash branch.

grumbach · 2026-05-20T08:01:30Z

+        if let Err(e) = self.verify_archive(&private_archive, &private_sig) {
+            warn!(
+                "Cached archive for version {version} FAILED ML-DSA signature \
+                 re-verification ({e}); discarding cache entry (possible \
+                 on-disk tampering). A fresh verified download will run."
+            );
+            return None;
+        }


Already addressed — same cleanup("signature re-verification failed") path is wired on the signature-verify-fail branch.

Per review-wave findings on PR WithAutonomi#100: 1. The upgrade staging tempdir is now created with 0700 on Unix via `tempfile::Builder::permissions`. Doc comments in binary_cache.rs describe the staging dir as "process-private" / "unreachable to the attacker"; pre-followup the dir inherited the default umask (typically 0755), so a same-host attacker on a different UID could in principle race the verify→extract window. The ML-DSA re-verification on the private copy is the primary defence; this is the small layer that brings the code in line with the doc claim. 2. `get_verified_archive` now cleans up the staged private archive + signature on every rejection path (sha256 read failure, sha256 mismatch, signature re-verification failure). The parent TempDir is already auto-cleaned on drop, so this is hygiene — but it also defends against any future caller that reuses the same private_dir across retries by making each rejection start from a clean slate.

…il cleanup Per second-round review wave (codex xhigh) on PR WithAutonomi#100: 1. [BLOCKER] get_verified_archive had no size policy before copying or reading cached files. signature::verify_from_file* and sha256_file both fs::read() the archive in full. An attacker with cache-dir write access (the A1 threat model) could otherwise drop a multi-GB .archive and force disk exhaustion in the private staging dir, or OOM during re-verification, before the entry was rejected. The download path already enforces MAX_ARCHIVE_SIZE_BYTES = 200 MiB; the cache-hit path now honours the same bound (plus the fixed SIGNATURE_SIZE) BEFORE any copy or read. store_archive itself defence-in-depths the same gate. 2. [SHOULD-FIX] download_verify_extract stored the cache entry BEFORE running extract_binary. A validly-signed but malformed/incompatible release therefore became a shared cache poison pill: every later node would hit the cache, fail extract, and re-download under no lock. Reordered: extract first, cache only on success. 3. [NIT] The cleanup helper was defined AFTER the two fs::copy calls that could leave private_archive (or a partially created private_sig) behind. Moved the helper definition before the copies so every rejection path — including a partial copy failure — runs the same scrub. Three new tests pin the size policy and the new ordering: test_oversize_cached_archive_is_rejected_before_copy test_wrong_size_signature_is_rejected_before_copy test_store_archive_rejects_oversize MAX_ARCHIVE_SIZE_BYTES tightened to pub(super) (one fewer clippy warning). 10/10 binary_cache tests + 485 lib tests pass; cfd clean.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

grumbach · 2026-05-20T08:01:32Z

+        // `signature::verify_from_file*` and `sha256_file` both load the
+        // archive into memory in full. An attacker with cache-dir write
+        // access (the A1 threat model) could otherwise drop a multi-GB
+        // `.archive` and force disk exhaustion in the staging dir or an OOM
+        // during re-verification before the entry is rejected. The download
+        // path already enforces `MAX_ARCHIVE_SIZE_BYTES`; cache hits must
+        // honour the same bound, plus the fixed `SIGNATURE_SIZE`.


Good catch — comment was wrong. Updated to clarify the memory pressure comes from signature::verify_from_file* (FIPS-204 requires the message as a slice); sha256_file streams 8 KiB chunks. Fixed in 7ba9bb6.

grumbach · 2026-05-20T08:01:34Z

        let meta_json = serde_json::to_string(&meta)
            .map_err(|e| Error::Upgrade(format!("Failed to serialize binary cache meta: {e}")))?;


Fixed — now reads "Failed to serialize cached archive metadata". 7ba9bb6.

… bypass Per round-3 codex review on PR WithAutonomi#100: The size gate from the prior commit used `fs::metadata` (which follows symlinks). A cache-dir writer could plant `ant-node-X.archive -> /dev/zero` (or a FIFO/device); `.len()` stats as 0 so the size gate passed, but `fs::copy` then read the underlying special file indefinitely into the private staging dir, exhausting disk before any signature check ran. Fix: replace `fs::metadata` with `fs::symlink_metadata` in `get_verified_archive` and `store_archive`, and require `file_type().is_file()` before any size check or copy. Symlinks, FIFOs, sockets, and device files are now rejected with a `warn!` before any read or copy of the suspicious path. This narrows the attack surface to "an attacker with cache-dir write who can also write real files there" — they still cannot bypass the ML-DSA signature re-verification on the private copy (the primary A1 defence) and they cannot trigger disk exhaustion via special-file indirection. New regression test on Unix: `test_symlink_cached_archive_is_rejected_before_copy` plants `cached_archive` as a symlink to `/dev/zero` and asserts `get_verified_archive` returns `None` without staging anything. 486 lib tests pass; cfd clean.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

grumbach · 2026-05-20T08:01:35Z

+        let cached_archive = self.cached_archive_path(version);
+        let cached_sig = self.cached_signature_path(version);
        let meta_path = self.meta_path(version);

        let meta_data = fs::read_to_string(&meta_path).ok()?;


Fixed — the metadata path now goes through the same open-handle gate as the archive and signature: regular-file check on the opened File and a 4 KiB MAX_META_BYTES cap. A symlink-to-/dev/zero or a huge meta file is rejected before any read. New test test_meta_symlink_to_special_file_is_rejected covers it. 7ba9bb6.

grumbach · 2026-05-20T08:01:37Z

+        let archive_meta = match fs::symlink_metadata(&cached_archive) {
+            Ok(m) => m,
+            Err(e) => {
+                debug!("Cannot stat cached archive for {version}: {e}");
+                return None;


Narrowed in 7ba9bb6. New open_regular_capped helper opens archive + signature + meta ONCE each, validates file_type().is_file() and size on the resulting File handle, and the subsequent stream into the private staging dir reads from that open handle (wrapped in Read::take(len)). There is no second path-based stat or open after this point, so a race on the cache path between the validation and the copy cannot redirect what gets staged.

grumbach · 2026-05-20T08:01:39Z

+        // Size policy gate — runs BEFORE we copy or read the cached files.
+        //
+        // `signature::verify_from_file*` and `sha256_file` both load the
+        // archive into memory in full. An attacker with cache-dir write
+        // access (the A1 threat model) could otherwise drop a multi-GB
+        // `.archive` and force disk exhaustion in the staging dir or an OOM


Fixed in the same commit as the other comment-fix — the wording now correctly attributes the in-memory load to signature::verify_from_file*.

The shared upgrade binary cache stored the extracted binary and, on a cache hit, returned it after only a SHA-256 check against a sibling .meta.json. SHA-256 is not a security control: anyone able to write to the shared cache directory (a co-located process, a shared container volume, a low-privilege foothold on the host) could drop a malicious binary plus a forged matching metadata hash, and the next ant-node instance to upgrade would execute it with no signature verification at all — persistent RCE on every co-located node. The ML-DSA-65 signature covers the archive and was only checked on the initial download, never on a cache hit. Changes: - Cache the signed *archive + detached signature* instead of the extracted binary. `BinaryCache::get_verified_archive` re-runs ML-DSA-65 verification on every cache hit; the binary is always extracted fresh from the just-verified archive. A tampered archive, tampered or missing signature, or forged metadata fails verification against the pinned release public key, so a poisoned cache entry is rejected and a fresh verified download runs. - Stage cached files into the caller's process-private temp directory and verify that copy, then extract from the same private path. Closes the verify-vs-extract TOCTOU on the shared cache files: an attacker cannot swap the bytes between when the verifier reads them and when the extractor reads them. - Size policy before any copy or read. `fs::symlink_metadata` + `file_type().is_file()` rejects symlinks / FIFOs / devices outright; archive size is bounded by `MAX_ARCHIVE_SIZE_BYTES` and the signature must be exactly `SIGNATURE_SIZE` bytes. Otherwise an attacker could plant `cached.archive -> /dev/zero` (stats as 0 bytes) and force unbounded disk fill in the staging dir or OOM in `signature::verify`. - Cache only after successful extraction. A validly-signed-but-malformed release no longer becomes a shared cache poison pill that every later node downloads, fails to extract, and re-downloads. - `cache_dir.rs` restricts the shared upgrade cache directory to 0700 on Unix as defence in depth; the ML-DSA gate is the primary control. - `store_archive` mirrors the same size / file-type / signature checks before persisting, so a poisoned entry cannot be created through the supported path either. Tests in `src/upgrade/binary_cache.rs` cover the tamper path (SHA-256-forged swap on disk rejected by the signature re-check), the post-hit shared-file swap (private copy unaffected), the symlink-to- `/dev/zero` bypass attempt, oversize archive / wrong-sized signature rejection, and round-trip storage. Production verifies against the pinned `RELEASE_SIGNING_KEY`; tests use a `#[cfg(test)]`-only constructor that injects a generated key without weakening the production trust anchor. Residual: cache entries are not bound to a specific release version (the ML-DSA signing context is constant across versions), so a same-UID attacker who already has any past validly-signed release can plant it under a newer version's cache key and force a downgrade to that old signed binary. Not RCE (still legitimately-signed bytes) and a same-UID attacker has easier paths anyway; closing it cleanly requires coordinated changes in the release-signing pipeline, ant-keygen, ant-node, and ant-client, and is tracked in the binary_cache module docs.

dirvine

Code Review: `fix(upgrade): re-verify ML-DSA signature on every cache hit`

I've reviewed the diff across all 3 files (+668/−124), the PR description, CI status, and consulted the codebase's signature.rs module.

Summary

This is a well-structured, thorough fix for the cache-poisoning RCE (A1). The approach is architecturally sound and the implementation is careful with security-relevant details. Main concerns below are minor — no blockers.

✅ What's done well

Core threat model is correctly identified and fixed. Caching the extracted binary with only SHA-256 integrity was the vulnerability; caching the signed archive and re-verifying ML-DSA on every hit is the correct remediation.
TOCTOU elimination via private staged copy. Copying the archive+signature into a process-private temp dir before verification, then extracting from that same private copy, definitively defeats a post-verification file swap on the shared cache. The test test_returned_archive_is_private_copy_immune_to_post_hit_swap confirms this.
Defence in depth. Size policy gates (symlink_metadata + is_file() + size bounds) run before any copy or read, preventing symlink/FIFO/device-file attacks and OOM/disk-exhaustion vectors. The 0700 cache dir permissions and store_archive re-verification are additional layered controls.
#[cfg(test)]-only constructor. new_with_verify_key is guarded by #[cfg(test)], ensuring the production trust anchor (the pinned RELEASE_SIGNING_KEY) cannot be weakened. The verify_key: Option<MlDsaPublicKey> — where None means "use embedded production key" and Some(key) is test-only — is a clean pattern.
Archive metadata written last (write-to-temp-then-rename). Readers never observe a partial state. Verified by inspection of store_archive.

🔍 Issues to address

1. macOS test failure — investigate CI log

Test (macos-latest) failed in the latest run. The run is still in progress so logs aren't available yet, but this needs investigation before merge. Likely candidates: ML-DSA key generation timeout on CI hardware (the OnceLock in test_keypair() should cache it, but first generation may be slow), or a filesystem-specific issue with the symlink tests on macOS.

Action: Investigate the failed test log once the run completes.

2. Extraction safety note (informational)

The extract_from_tar_gz function extracts to dest_dir but should be checked for path traversal (entries with ../ in archive paths). This is outside the scope of this PR's cache fix, but worth verifying that the tar extraction doesn't follow symlinks or escape the temp dir.

3. Potential follow-up: version binding

The downgrade attack documented in the module docs is correctly scoped as a follow-up. One possible approach: include the version string inside the archive metadata (as a signed manifest), and have the cache verifier check archive_version == expected_version in addition to the ML-DSA signature. This wouldn't require changing the signing context or key, just adding a structured manifest to the archive. Worth discussing as the next step.

🔍 Minor observations (no action required)

atomic_copy behaviour: The fs::remove_file(dest) before fs::rename is correct for Windows compatibility. On Unix, rename is atomic even when the destination exists.
Concurrent access safety: The exclusive DownloadLockGuard correctly serialises downloads+stores. The read path (get_verified_archive) is lock-free and safe because meta.json is always written last — a reader can never observe a partially-committed write.
Hard links: symlink_metadata + is_file() correctly rejects symlinks but not hard links. Hard links require write permission on the target directory + target, which is the same access level needed for direct modification — acceptable residual.

🧾 Overall assessment

Approved with the macOS test failure needing investigation. The fix is correct, the threat model coverage is thorough, and the implementation is careful. The documented residual (version-unbound cache entries) is a known limitation scoped for follow-up work.

Key strengths:

Cryptographic verification on every cache hit (primary gate)
Private staging eliminates TOCTOU
Size/file-type gates prevent pre-verification attacks
Test coverage includes tamper, TOCTOU, symlink, and overflow scenarios
Production trust anchor is structurally isolated from test code

Review feedback on the upgrade binary cache: - `meta.json` was read with an unbounded `fs::read_to_string`. An attacker with write access to the shared cache directory could plant the metadata sidecar as a symlink to `/dev/zero` or as a huge file and stall the read into a hang/OOM before the archive/sig hardening ran. The metadata path now goes through the same open-once-and-validate gate as the archive: regular-file check on the opened handle, capped at `MAX_META_BYTES` (4 KiB). - Archive + signature staging previously did `symlink_metadata` (path) followed by `fs::copy` (path), leaving a small TOCTOU window where an attacker could race-swap the path to a symlink/FIFO/device or an oversized file between the check and the copy. Both files are now opened once via `open_regular_capped`, validated on the resulting `File` handle (size + file-type), and copied into the private staging dir from the open handle (wrapped in `Read::take(len)` as belt-and-braces against a post-open extension). All subsequent operations on those files use the staged private bytes, never the shared path. - Comment fix: the prior comment claimed `sha256_file` loads the archive into memory in full. It actually streams in 8 KiB chunks; the memory-pressure concern is `signature::verify_from_file*` (FIPS-204 requires the message as a slice). Wording updated. - Stale error message "Failed to serialize binary cache meta" updated to "Failed to serialize cached archive metadata" — the cache now stores archive metadata, not extracted-binary metadata. Two new tests: test_oversized_meta_is_rejected test_meta_symlink_to_special_file_is_rejected (Unix-only) 488 lib tests pass; cfd clean.

grumbach · 2026-05-20T08:01:59Z

Pushed 7ba9bb6 addressing the inline review comments:

Metadata read now goes through the same open-handle gate as the archive and signature: regular-file check + 4 KiB MAX_META_BYTES cap. Closes the meta-symlink-to-/dev/zero vector.
Archive + signature staging now opens each file once, validates is_file() + size on the resulting File handle, and streams from that handle into the private dir (with Read::take(len)). No second path-based stat or open after the validation, so an attacker can no longer race a swap between the check and the copy.
Comments fixed (sha256_file streams 8 KiB chunks; the in-memory load is in signature::verify_from_file*).
Stale error message updated.

Two new tests: test_oversized_meta_is_rejected, test_meta_symlink_to_special_file_is_rejected (Unix). 488 lib tests pass, clippy/fmt clean.

The Copilot comments about cleanup-on-rejection paths (lines 277/295 in the original review) were referring to pre-squash line numbers; the cleanup helper is already wired on every rejection path on the current branch tip — replied inline.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

+fn open_regular_capped(path: &Path, max_len: u64) -> io::Result<(File, u64)> {
+    let file = OpenOptions::new().read(true).open(path)?;
+    let meta = file.metadata()?;
+    if !meta.file_type().is_file() {
+        return Err(io::Error::new(
+            io::ErrorKind::InvalidInput,
+            "not a regular file (FIFO/device/socket/dir)",
+        ));


Copilot AI review requested due to automatic review settings May 19, 2026 07:00

Copilot started reviewing on behalf of grumbach May 19, 2026 07:00 View session

Copilot AI reviewed May 19, 2026

View reviewed changes

grumbach force-pushed the fix/a1-upgrade-cache-resign-verify branch from 97a1a34 to ae0038a Compare May 20, 2026 04:55

Copilot AI review requested due to automatic review settings May 20, 2026 05:16

Copilot started reviewing on behalf of grumbach May 20, 2026 05:16 View session

Copilot AI reviewed May 20, 2026

View reviewed changes

Copilot AI review requested due to automatic review settings May 20, 2026 06:18

Copilot started reviewing on behalf of grumbach May 20, 2026 06:18 View session

grumbach force-pushed the fix/a1-upgrade-cache-resign-verify branch from 5d8807b to d2fd74e Compare May 20, 2026 06:19

Copilot AI reviewed May 20, 2026

View reviewed changes

grumbach marked this pull request as draft May 20, 2026 06:43

grumbach marked this pull request as ready for review May 20, 2026 06:45

grumbach force-pushed the fix/a1-upgrade-cache-resign-verify branch from d2fd74e to de30b06 Compare May 20, 2026 07:20

grumbach changed the title ~~fix(upgrade): re-verify ML-DSA signature on every cache hit (A1 RCE)~~ fix(upgrade): re-verify ML-DSA signature on every cache hit May 20, 2026

dirvine reviewed May 20, 2026

View reviewed changes

Copilot AI review requested due to automatic review settings May 20, 2026 08:00

Copilot started reviewing on behalf of grumbach May 20, 2026 08:01 View session

Copilot AI reviewed May 20, 2026

View reviewed changes

		let meta_json = serde_json::to_string(&meta)
		.map_err(\|e\| Error::Upgrade(format!("Failed to serialize binary cache meta: {e}")))?;

Conversation

grumbach commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fix

Tests

Residual (not addressed here)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

grumbach May 20, 2026

Choose a reason for hiding this comment

Uh oh!

grumbach May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

grumbach May 20, 2026

Choose a reason for hiding this comment

Uh oh!

grumbach May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

grumbach May 20, 2026

Choose a reason for hiding this comment

Uh oh!

grumbach May 20, 2026

Choose a reason for hiding this comment

Uh oh!

grumbach May 20, 2026

Choose a reason for hiding this comment

Uh oh!

dirvine left a comment

Choose a reason for hiding this comment

Code Review: fix(upgrade): re-verify ML-DSA signature on every cache hit

Summary

✅ What's done well

🔍 Issues to address

1. macOS test failure — investigate CI log

2. Extraction safety note (informational)

3. Potential follow-up: version binding

🔍 Minor observations (no action required)

🧾 Overall assessment

Uh oh!

grumbach commented May 20, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

grumbach commented May 19, 2026 •

edited

Loading

Code Review: `fix(upgrade): re-verify ML-DSA signature on every cache hit`