From 847ab4768a30144e2325e0169cea82fc3a7c853b Mon Sep 17 00:00:00 2001
From: Jascha
Date: Thu, 14 May 2026 23:09:09 -0700
Subject: [PATCH] Release v0.2.0-rc.1

---
 CHANGELOG.md            | 163 ++++++++++++++++++++++++++++++++++++++++
 CITATION.cff            |   4 +-
 pyproject.toml          |   2 +-
 rust/Cargo.lock         |   2 +-
 rust/Cargo.toml         |   2 +-
 typescript/package.json |   2 +-
 6 files changed, 169 insertions(+), 6 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index d73ba04..b673c43 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,169 @@ All notable changes to VectorPin will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.2.0-rc.1] — 2026-05-14
+
+Release candidate for 0.2.0. **This is a wire-format break.** Pins
+produced by 0.1.x do not verify under the default 0.2.0 verifier;
+a `LegacyV1Verifier` is shipped in all three languages as an opt-in
+migration aid. The break is the response to a security audit
+(2026-05) that identified four cross-implementation issues. See
+[`docs/spec.md` §12](docs/spec.md#12-changes-from-v1) for the full
+v1 → v2 change list.
+
+### Protocol — wire-format v2
+
+- Protocol version field bumped to `v: 2`. Strict v2 verifiers reject
+  v1 pins.
+- **`v` and `kid` are now signed.** Both are part of the canonical
+  payload, defeating downgrade attacks and cross-key swap attacks.
+- **Domain separator.** Signed bytes are now
+  `b"vectorpin/v2\x00" || canonical_json(header)` (13-byte tag),
+  preventing cross-protocol signature reuse with any sister Trust-Stack
+  protocol.
+- **NaN/Inf rejection at sign time.** `+0.0` and `-0.0` remain distinct.
+- **NFC normalization mandatory** on every string-typed field
+  (`model`, `kid`, `ts`, every `extra` key, every `extra` value).
+  Control characters U+0000–U+001F and bidi overrides U+202A–U+202E /
+  U+2066–U+2069 are rejected.
+- **`extra` is strictly `map<string, string>`.** Non-string values
+  cause `PARSE_ERROR`.
+- **Strict timestamp format.** `ts` must match exactly
+  `^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$`. No
+  fractional seconds, no offset variants.
+- **Unknown top-level fields rejected** at parse time.
+- **Size limits enforced** (`docs/spec.md` §4.3): pin JSON ≤ 64 KiB,
+  ≤ 32 `extra` entries, key ≤ 128 B, value ≤ 1 KiB, `vec_dim` ≤ 2^20,
+  decoded `sig` length exactly 64.
+
+### Verification — replay protection and revocation
+
+- New `KeyEntry` registry shape with optional
+  `(valid_from, valid_until)` window. Pins whose `ts` falls outside
+  the window return `KEY_EXPIRED` — separates rotation from
+  compromise-driven revocation while preserving historical pin
+  verifiability.
+- Replay-protection check: callers may supply
+  `expected_record_id` / `expected_collection_id` / `expected_tenant_id`,
+  verified against the reserved `vectorpin.*` keys in `extra`. Returns
+  `RECORD_MISMATCH` / `COLLECTION_MISMATCH` / `TENANT_MISMATCH` on
+  divergence (spec §5 step 8).
+- Spec failure-mode taxonomy expanded to include `KEY_EXPIRED`,
+  `PARSE_ERROR`, the three `*_MISMATCH` codes, and `UNSUPPORTED_DTYPE`.
+
+### Implementations
+
+All three reference implementations produce byte-for-byte identical
+canonical bytes and Ed25519 signatures from the same deterministic
+seed (verified by `testvectors/v2.json` and the per-language
+cross-language test).
+
+#### Python
+
+- `PROTOCOL_VERSION = 2`, `DOMAIN_TAG = b"vectorpin/v2\x00"` exported
+  from `vectorpin.attestation`.
+- `Pin.from_*` strict schema: 64 KiB cap, type/regex/length checks on
+  every field, `vec_dtype` allowlist, sig length 64 enforced.
+- `Verifier` (strict v2) and `LegacyV1Verifier` (opt-in v1+v2).
+- `Verifier.verify(..., expected_record_id=..., expected_collection_id=..., expected_tenant_id=...)`
+  enforces replay-protection bindings.
+- `KeyEntry` carries `(valid_from, valid_until)`; `KEY_EXPIRED` fires
+  per §7.
+
+#### Rust
+
+- `pub const DOMAIN_TAG: &[u8] = b"vectorpin/v2\x00"`,
+  `pub const PROTOCOL_VERSION: u32 = 2` exported.
+- New `VerifyError` variants: `KeyExpired`, `ParseError(String)`,
+  `RecordMismatch`, `CollectionMismatch`, `TenantMismatch`,
+  `UnsupportedDtype(String)`.
+- `VerifyOptions` builder carries replay-protection expected values.
+- `LegacyV1Verifier` opt-in.
+
+#### TypeScript
+
+- Async signing/verifying API throughout (`signAsync` / `verifyAsync`).
+  Drops the globally-mutable `ed25519.etc.sha512Sync` hook.
+- `Signer.fromPrivateBytes` makes a defensive copy of the seed.
+  `Signer.wipe()` zeros it.
+- Pinned exact crypto deps: `@noble/ed25519@2.3.0`,
+  `@noble/hashes@1.8.0`.
+- Prototype-pollution guards in `pinFromDict`; strict base64url
+  alphabet enforced before signature decode.
+
+### Hardening — implementation surface
+
+Beyond the wire-format break, the audit-driven hardening also closes
+implementation-level findings:
+
+- **Python CLI**: `vectorpin keygen` now writes the private seed with
+  mode `0o600` via `O_EXCL` (no umask reliance, refuses to clobber an
+  existing key); parent directory created with mode `0o700`. The
+  public key is explicitly set to `0o644`.
+- **Python adapters**: LanceDB validates `id_column` / `vector_column`
+  / `pin_column` against an identifier regex and rejects `record_id`
+  containing NUL, newline, or backslash. Qdrant and Pinecone refuse
+  an `api_key` over `http://` for non-loopback hosts unless
+  `VECTORPIN_ALLOW_INSECURE_HTTP=1` is set.
+- **Python audit loop**: a single malformed pin in
+  `audit-{lancedb,chroma,qdrant}` no longer aborts the run; bad rows
+  are surfaced as `parse_error` and the audit continues.
+- **Python `Signer.from_pem`**: requires explicit `password=...` or
+  `allow_unencrypted=True` to load an unencrypted PEM. Default
+  behavior refuses.
+- **Python dependency bounds**: `cryptography>=42,<46`,
+  `numpy>=1.26,<3` in `pyproject.toml`.
+- **Rust**: `#![forbid(unsafe_code)]` on the crate.
+  `Signer::generate` returns `Result`.
+  `Signer::private_key_bytes` returns `Zeroizing<[u8; 32]>`.
+  `vec_dim` cast via `u32::try_from` on signer + verifier sides.
+  `Verifier::add_key` returns `Result<(), VerifyError::KeyDecodeFailed>`.
+  `zeroize = "1"` added as a direct dep.
+- **TypeScript**: switched to async signing/verifying API
+  (`signAsync` / `verifyAsync`), dropping the globally-mutable
+  `ed25519.etc.sha512Sync` hook. `Signer.fromPrivateBytes` makes a
+  defensive copy. New `Signer.wipe()` zeros the seed. Module-load
+  assertion that `crypto.getRandomValues` is available. Prototype-
+  pollution guards in `pinFromDict`. Sanitized error detail strings
+  (strip control chars, truncate). `@noble/ed25519@2.3.0` and
+  `@noble/hashes@1.8.0` pinned to exact versions.
+
+### Test vectors
+
+- `testvectors/v2.json` — 4 positive fixtures covering f32, f64,
+  `model_hash`, and `extra` with `vectorpin.record_id`. Each carries
+  `expected_canonical_bytes_b64` for cross-language equality assertion.
+- `testvectors/negative_v2.json` — 17 fixtures exercising every
+  failure mode in spec §5: tampered vector, tampered source, wrong
+  model, wrong `v`, wrong `kid`, bit-flipped sig, wrong sig length,
+  unknown top-level field, non-string `extra` value, NaN in vector,
+  NFD source, fractional-seconds `ts`, offset `ts`, lowercase `t`/`z`
+  `ts`, record_id mismatch, oversize JSON.
+- `testvectors/v1.json` and `testvectors/negative_v1.json` retained
+  for `LegacyV1Verifier` coverage.
+
+### Migration
+
+Existing v1 pins do not verify under the strict default v2 verifier
+in any language. To migrate a corpus:
+
+1. Read each pin with `LegacyV1Verifier` (opt-in flag /
+   constructor / class).
+2. Re-sign with the v2 `Signer`, which writes `v: 2` and the new
+   canonical bytes.
+3. Write the re-signed pin back to the vector store.
+
+Plain re-pinning preserves the bound `(source, vector, model)` triple
+while replacing the now-deprecated v1 signature.
+
+### Documentation
+
+- New Zensical-rendered documentation site (`docs/`, `zensical.toml`):
+  index, getting-started, pin-protocol, CLI guide, adapters, detectors,
+  deployment, security, troubleshooting. The normative protocol
+  reference remains `docs/spec.md`. Published at
+  `https://docs.vectorpin.org/` via GitHub Pages.
+
 ## [0.1.1] — 2026-05-07
 
 Patch release. No protocol changes; pins produced by 0.1.0 verify on
diff --git a/CITATION.cff b/CITATION.cff
index 776e043..6d99162 100644
--- a/CITATION.cff
+++ b/CITATION.cff
@@ -14,8 +14,8 @@ abstract: >-
   post-embedding modification breaks signature verification on read. Reference
   implementations in Python, Rust, and TypeScript are byte-for-byte compatible,
   locked together by shared test vectors. Part of the ThirdKey Trust Stack.
-version: "0.1.1"
-date-released: 2026-05-07
+version: "0.2.0-rc.1"
+date-released: 2026-05-14
 keywords:
   - vector database
   - embedding store
diff --git a/pyproject.toml b/pyproject.toml
index 8f9fde1..85412eb 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "vectorpin"
-version = "0.1.1"
+version = "0.2.0rc1"
 description = "Verifiable integrity for AI embedding stores."
 readme = "README.md"
 requires-python = ">=3.11"
diff --git a/rust/Cargo.lock b/rust/Cargo.lock
index 7d9d84d..6784ce7 100644
--- a/rust/Cargo.lock
+++ b/rust/Cargo.lock
@@ -644,7 +644,7 @@ dependencies = [
 
 [[package]]
 name = "vectorpin"
-version = "0.1.1"
+version = "0.2.0-rc.1"
 dependencies = [
  "base64",
  "criterion",
diff --git a/rust/Cargo.toml b/rust/Cargo.toml
index be82265..6cb1c78 100644
--- a/rust/Cargo.toml
+++ b/rust/Cargo.toml
@@ -3,7 +3,7 @@ resolver = "2"
 members = ["vectorpin"]
 
 [workspace.package]
-version = "0.1.1"
+version = "0.2.0-rc.1"
 edition = "2021"
 rust-version = "1.75"
 license = "Apache-2.0"
diff --git a/typescript/package.json b/typescript/package.json
index cb4a809..8bd0720 100644
--- a/typescript/package.json
+++ b/typescript/package.json
@@ -1,6 +1,6 @@
 {
   "name": "vectorpin",
-  "version": "0.1.1",
+  "version": "0.2.0-rc.1",
   "description": "Verifiable integrity for AI embedding stores. TypeScript reference implementation.",
   "license": "Apache-2.0",
   "author": "Jascha Wanger / ThirdKey.ai",