AdaWorldAPI · AdaWorldAPI · May 18, 2026 · May 18, 2026 · May 18, 2026 · May 18, 2026
diff --git a/.claude/knowledge/cognitive-distance-typing.md b/.claude/knowledge/cognitive-distance-typing.md
@@ -0,0 +1,166 @@
+# KNOWLEDGE: Cognitive Distance Metrics Are Typed — No Generic Umbrella, No Roundtrips
+
+## READ BY
+- Any agent designing or modifying distance APIs in `src/hpc/{plane,vsa,distance,cascade,causal_diff,seal,merkle_tree,bnn,clam,fingerprint}.rs`
+- Any agent proposing a generic `fn distance<T>(a: &T, b: &T) -> f32` umbrella
+- W7 implementers (when W7 — cognitive bulk ops — moves from deferred to active)
+- The W3-W6 plan-review savant currently auditing the SoA/AoS design (this doc constrains W7's scope and bounds what the SoA helpers should NOT grow into)
+
+## P0 TRIGGERS
+- About to design a generic `distance(a, b) -> f32` that picks the metric internally → STOP, distance is typed; build one fn per metric and name it
+- About to chain `palette 256 → fisher z → cosine → hamming → popcount → palette 256` in one call path → STOP, that's the canonical worst-case roundtrip and erases the typing
+- About to drop the `buckets` / Euler-gamma-offset arguments from a palette-256 distance call → STOP, those are PART of the typed distance, not optional context
+- About to silently convert between metric types in an intermediate step → STOP, conversions must be EXPLICIT (named fns) and documented at the call site as escape hatches
+
+---
+
+## The taxonomy
+
+Each row is a distinct typed primitive. Mixing them at runtime is the bug this doc exists to prevent.
+
+| Metric | Computes | Input | Output | Cascade role | Notes |
+|---|---|---|---|---|---|
+| **Palette 256 distance** | precomputed 256×256 table lookup with bucket + Euler-gamma-offset context | two `PaletteIdx` (u8) + `Buckets` + `EulerGammaOffset` | `PaletteDistance` (typed f32 newtype) | **Level 3** (finalist scoring, ~200 candidates) | The buckets and Euler offset are INTEGRAL to the metric. Dropping them changes the answer; passing zero offset is NOT the same as omitting. |
+| **HDR popcount early-exit** | Hamming on 256-bit bitpacked fingerprints with under-threshold short-circuit | two `&Fingerprint256` + `u16` threshold | `Option<HammingDistance>` (None = exceeded threshold, early exit) | **Level 1** (cosine REPLACEMENT, ~1M → ~20K) | This IS the cosine replacement on the cascade — NOT a derivative or approximation of cosine. The popcount metric directly substitutes for FP cosine in the search topology. |
+| **Base17 L1** | L1 (Manhattan) distance on 17-dim i16 vectors | two `&[i16; 17]` | i32 | **Level 2** (~20K → ~200) | Fits in one AVX-512 load or two NEON loads. The 17-dimension shape is specific; don't pad to 16 or 18. |
+| **Fisher-z transform** | variance-stabilizing transform of correlation → z-score | f32 correlation | f32 z-score | **NOT a distance** — a normalization applied to palette 256 OUTPUT when distance distributions need comparison across heterogeneous buckets | Calling Fisher-z on a non-correlation value is a category error. |
+| **BF16 mantissa exact transformation** | direct palette-256 → palette-256 mapping using BF16 mantissa context | `PaletteIdx` + `EulerGammaOffset` (+ mantissa context) | new `PaletteIdx` | **bypasses the cascade entirely** when the direct mapping is known | The fast path: when you already have a palette index and need the transformed palette index under a known offset, this is one typed hop in palette space. No metric translation, no cascade levels. |
+
+---
+
+## The roundtrip anti-pattern (worst case — do not write this code path)
+
+```
+  palette 256 distance        [Level 3 typed]
+        │
+        ▼
+   fisher z normalize         [valid: variance-stabilize a Level-3 result]
+        │
+        ▼
+   "treated as cosine"        ←── BUG: popcount IS cosine replacement,
+        │                          fisher-z of palette ≠ cosine input
+        ▼
+   hamming distance           [Level 1 typed — wrong scale, wrong topology]
+        │
+        ▼
+   HDR popcount preheat       [Level 1 detail — re-derived from wrong source]
+        │
+        ▼
+   early exit                 [decisions made on un-rounded round-trip]
+        │
+        ▼
+   palette 256 distance       [back to start, BUT: buckets + Euler offset lost
+                               somewhere in the chain — answer differs from
+                               the original Level-3 result by quantization noise
+                               + loss of bucket assignment]
+```
+
+Each arrow:
+- pays an arithmetic / cache cost
+- loses the typed-distance identity (the type system stops protecting the call site)
+- introduces conversion error that compounds along the chain
+- can converge back to "approximately the same number" — which makes the bug invisible in unit tests but wrong in deployment
+
+## The direct path (preferred)
+
+```
+  palette 256 distance ──[Euler gamma offset known]──▶ palette 256 BF16 mantissa exact transformation
+                                                                    │
+                                                                    ▼
+                                                              new PaletteIdx
+                                                              (stays in palette space)
+```
+
+ONE typed step. Stays in palette-256 type space. Preserves bucket + offset throughout. No cascade traversal, no metric translation, no conversion noise.
+
+When the BF16-mantissa direct path is applicable (caller has a `PaletteIdx` and an `EulerGammaOffset`), use it. The cascade exists for the case where you don't yet know which palette band the target lives in.
+
+---
+
+## API design rule (binding)
+
+1. **One fn per metric. Named.**
+   ```rust
+   pub fn palette256_distance(
+       a: PaletteIdx, b: PaletteIdx,
+       buckets: &Buckets, offset: EulerGammaOffset,
+   ) -> PaletteDistance;
+
+   pub fn hdr_popcount_early_exit(
+       a: &Fingerprint256, b: &Fingerprint256, threshold: u16,
+   ) -> Option<HammingDistance>;
+
+   pub fn base17_l1(a: &[i16; 17], b: &[i16; 17]) -> i32;
+
+   pub fn palette256_bf16_mantissa_transform(
+       p: PaletteIdx, offset: EulerGammaOffset, mantissa: BF16MantissaCtx,
+   ) -> PaletteIdx;
+   ```
+
+2. **Conversions are explicit and named.** When a caller must cross metric boundaries (escape-hatch case):
+   ```rust
+   pub fn hamming_distance_to_palette_index_estimate(d: HammingDistance) -> PaletteIdx;
+   //                                  ^-- name says "estimate" so callers can't mistake it for exact
+   ```
+   The call site MUST carry a comment naming WHY the conversion is happening (not the default path).
+
+3. **No `Box<dyn Distance>` / no `enum DistanceMetric { Palette, Hamming, Base17, … }` / no `fn distance<T: HasMetric>(a, b) -> f32` umbrella.**
+   The type system distinguishes the metrics for a reason.
+
+4. **Newtype the output.** `PaletteDistance(f32)`, `HammingDistance(u16)`, `Base17L1(i32)` — different output types prevent accidental cross-metric arithmetic. `let d = palette256_distance(...) + hamming_dist;` should not compile.
+
+5. **The buckets and Euler-gamma-offset arguments to palette-256 fns are REQUIRED, not optional.** Default values for those parameters are domain-meaningful (changing them changes the metric); a `Default::default()` impl on `EulerGammaOffset` is acceptable only with strong documentation that the default is a deliberate calibration constant, not a sentinel.
+
+---
+
+## What this means for the SoA/AoS sprint (W3-W6) and beyond
+
+### W3-W6 (currently in-flight on `claude/w3-w6-soa-aos-helpers`)
+- `SoaVec<T, N>`, `soa_struct!`, `aos_to_soa`, `soa_to_aos`, `bulk_apply`, `bulk_scan` are **generic over T**. They do NOT bake in any distance metric.
+- **DO NOT** during the sprint or in post-review extend any of these primitives toward distance computation. Distance stays out of the helper layer.
+- If a worker is tempted to add `fn bulk_distance<T>(...)` to `bulk.rs` → STOP, that's the umbrella anti-pattern.
+
+### W7 (deferred — cognitive bulk ops)
+- When W7 lands, EACH metric gets its own bulk primitive named for the metric:
+  ```rust
+  pub fn bulk_hdr_popcount_early_exit(
+      query: &Fingerprint256, db: &[Fingerprint256], threshold: u16,
+  ) -> Vec<Option<HammingDistance>>;
+
+  pub fn bulk_palette256_distance(
+      query: PaletteIdx, db: &[PaletteIdx],
+      buckets: &Buckets, offset: EulerGammaOffset,
+  ) -> Vec<PaletteDistance>;
+
+  pub fn bulk_palette256_mantissa_transform(
+      palettes: &[PaletteIdx], offset: EulerGammaOffset, mantissa: BF16MantissaCtx,
+  ) -> Vec<PaletteIdx>;
+  ```
+- Underneath, the bulk fns MAY use `SoaVec` / `bulk_apply` from W3/W4 for layout staging. That's fine — those are layout helpers, not distance helpers.
+- The cascade orchestrator in `hpc/cascade.rs` calls each Level's typed bulk primitive directly. It does NOT internally translate Level-1 outputs to Level-3 inputs by passing through Fisher-z.
+
+### Bench harness (prereq for W7's SIMD acceleration)
+- Per-metric benches: one per typed primitive, no umbrella `bench_distance` macro.
+- The bench output should report the typed primitive's name in the column header so regressions are attributable to the specific metric.
+
+---
+
+## Cross-references
+
+- `CLAUDE.md` § "Three-Level Cascade: How the Search Actually Works" — describes L1 Hamming sweep, L2 Base17 L1, L3 Palette lookup. The Levels are typed distance bands, NOT tiers of a generic distance abstraction.
+- `src/hpc/cascade.rs` — the orchestrator. Inspect before adding any new distance code; the call chain there is the canonical correct example.
+- `src/hpc/distance.rs` — distance utilities. Audit for any `fn distance<T>` umbrella that may have crept in; refactor to typed fns if found.
+- `src/hpc/plane.rs`, `src/hpc/vsa.rs` — produce values that feed the cascade. Their output types must be the typed distance newtypes, not raw `f32`.
+- `.claude/knowledge/w3-w6-soa-aos-design.md` — the W3-W6 helper design (this doc constrains what those helpers must NOT grow into).
+- `.claude/knowledge/vertical-simd-consumer-contract.md` — the layering rule (user → crate::simd → simd_{type}). Same family: layering and typing are both about preserving identity across abstraction levels.
+
+---
+
+## TL;DR for an agent reading this in 30 seconds
+
+1. Palette-256 distance ≠ Hamming popcount ≠ Base17 L1. Don't put them under one API.
+2. Palette-256 distance carries `buckets + EulerGammaOffset` always. Don't drop them.
+3. The fast path inside palette space is the BF16-mantissa direct transform — one hop, no cascade.
+4. The cascade is THREE typed levels in sequence, NOT a generic distance pipeline with intermediate conversions.
+5. Conversions between metric types must be explicit, named, documented per call site.
+6. No `Box<dyn Distance>`, no umbrella `fn distance<T>(...)`, no `enum DistanceMetric`.
diff --git a/.claude/knowledge/w3-w6-codex-audit.md b/.claude/knowledge/w3-w6-codex-audit.md
@@ -0,0 +1,55 @@
+# W3-W6 Codex Audit
+
+Auditor: codex P0 review agent
+Branch: `claude/w3-w6-soa-aos-helpers` (rebased onto master at `3f35170d`)
+Commits under review:
+- `5095853c` feat(hpc/soa): `SoaVec` + `soa_struct!` + `aos_to_soa` + `soa_to_aos` (W3+W5+W6)
+- `5845ab2d` feat(hpc/bulk): `bulk_apply` + `bulk_scan` (W4)
+
+## Verdict
+
+**READY-FOR-PR.** Zero P0 findings. One P1 cosmetic docstring gap (patched on this branch). Three P2 items intentionally deferred per the design contract.
+
+## Verification exit codes (all 0)
+
+| Command | Exit | Notes |
+|---|---|---|
+| `cargo check -p ndarray --no-default-features --features std` | 0 | |
+| `cargo test -p ndarray --lib --no-default-features --features std hpc::soa` | 0 | 29 passed, 0 failed |
+| `cargo test -p ndarray --lib --no-default-features --features std hpc::bulk` | 0 | 16 passed, 0 failed |
+| `cargo test --doc -p ndarray --no-default-features --features std hpc::soa` | 0 | 10 passed, 1 intentionally ignored |
+| `cargo test --doc -p ndarray --no-default-features --features std hpc::bulk` | 0 | 2 passed, 1 intentionally ignored |
+| `cargo fmt --all -- --check` | 0 | |
+| `cargo clippy -p ndarray --no-default-features --features std -- -D warnings` | 0 | |
+
+## P0 findings
+
+None.
+
+## P1 findings
+
+- **F4** — `usize::MAX` chunk-size behavior is tested at `src/hpc/bulk.rs:182-194` but NOT documented in the public docstring of `bulk_apply` (`src/hpc/bulk.rs:46-66`) or `bulk_scan` (`src/hpc/bulk.rs:80-97`). One-line addition: "`chunk_size == usize::MAX` yields the entire slice as a single chunk." **Patched on this branch** before PR opens.
+
+## P2 findings (deferred per design contract)
+
+- **G1** (`bulk_scan` naming): the savant flagged that "scan" conventionally means fold-with-state. Kept as `bulk_scan` for symmetry with `bulk_apply`. Rename to `bulk_for_each` / `bulk_inspect` is a follow-up if downstream consumers find the name misleading.
+- **G2** (`SoaVec::iter_rows`): row iterator yielding `[&T; N]` per row is absent. Use `soa.chunks(1)` for the same effect. Deferred to a follow-up once a real use case exists.
+- **G3** (`SoaVec` lacks `#[derive(Clone, Debug)]`): macro-generated structs DO support derive passthrough (verified by test at `src/hpc/soa.rs:733-742`), but the generic container does not. Deliberately deferred — adding derives would require `where T: Clone + Debug` bounds that callers don't always want.
+
+## D4 — integration test gate
+
+The `bulk_apply` × `aos_to_soa` integration test at `src/hpc/bulk.rs:295` correctly uses `#[cfg(any())]` (canonical never-compile sentinel). The test body at `src/hpc/bulk.rs:297-324` is sound. Now that `hpc/soa.rs` and `hpc/bulk.rs` are landing in the same PR, the gate could be removed as a follow-up — but worker B's `cfg(any())` gate preserves the safe deferral if the PR review wants to keep them independently mergeable.
+
+## Compliance summary
+
+| Concern | Status | Notes |
+|---|---|---|
+| Layering rule (no `#[target_feature]`, no per-arch imports, no raw intrinsics) | ✅ clean | Only doc-prose mentions in module headers; zero actual attributes |
+| Distance typing (no umbrella `fn distance<T>`, no `enum DistanceMetric`, no `Box<dyn Distance>`) | ✅ clean | Both module headers cite `cognitive-distance-typing.md` and warn against extension toward distance |
+| Spec API match (per design doc v2 §C1-C7, §D1-D4) | ✅ exact | `field_n::<const I>()` uses `const { assert!(I < N) }`; all method signatures verbatim |
+| Doc coverage (every `pub fn` has `///` doc with working `# Example`) | ✅ complete | After F4 patch |
+| Test coverage (per design doc §"Tests" per fn) | ✅ complete | 29 + 16 = 45 unit tests + 12 doctests |
+
+## Recommended next step
+
+Open the W3-W6 PR. F4 fix landed on the same branch as a follow-up commit; integration test stays `cfg(any())`-gated for the PR; P2 items deferred.