feat(security): #502 — compile-time URL/host egress allowlist#956

Merged: proggeramlug merged 1 commit into main from feat/502-egress-allowlist on May 18, 2026
@proggeramlug

Closes #502.

Summary

Adds a compile-time HIR pass that gives the host a static guarantee about its binary's outbound network surface. When perry.allowedHosts is set in the host package.json, every literal URL/host in a fetch(...) / net.connect(...) / net.createConnection(...) call must match a pattern in the list — otherwise the build fails before producing a binary.

Zero runtime cost — purely a compile-time HIR walk.

Cross-platform — the gate runs in the platform-agnostic compile_command driver, so every backend (LLVM / WASM / ArkTS / HarmonyOS / Glance / SwiftUI / JS) inherits the protection from one choke point.

Diagnostic example

Error: egress allowlist refused 2 call site(s):
  - /repo/main.ts: fetch → "https://evil.com/leak" (literal host not in `perry.allowedHosts`)
  - /repo/lib/foo.ts: net.connect → "x.evil.com" (literal host not in `perry.allowedHosts`)

`perry.allowedHosts` provides a static guarantee that this binary's
outbound network surface matches the declared list. Refusing the build. (#502)

Options:
- Add the offending host(s) to `perry.allowedHosts` ...
- Set `"*"` in `allowedHosts` to disable host gating ...
- For non-literal URLs, set `perry.allowDynamicHosts: true` ...

All offending sites surface in a single error, which is better UX than failing on the first one and asking the user to re-run. The list is capped at 12 entries to keep error output bounded.
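As a rough illustration of the collect-then-report behavior described above, the following sketch gathers every violation first and renders one capped message. The `Violation` struct, the `render` function, and the trailing "and N more" line are hypothetical and are not taken from `perry-hir::egress`; only the 12-entry cap and the general message shape come from this PR.

```rust
/// One refused call site (hypothetical shape for this sketch).
struct Violation {
    file: String,
    callee: String, // e.g. "fetch" or "net.connect"
    target: String, // the literal URL/host found at the call site
}

/// The PR caps the rendered list at 12 entries.
const MAX_REPORTED: usize = 12;

/// Render all violations as one message, or None when the build is clean.
fn render(violations: &[Violation]) -> Option<String> {
    if violations.is_empty() {
        return None;
    }
    let mut out = format!(
        "egress allowlist refused {} call site(s):\n",
        violations.len()
    );
    // List at most MAX_REPORTED entries.
    for v in violations.iter().take(MAX_REPORTED) {
        out.push_str(&format!("  - {}: {} → \"{}\"\n", v.file, v.callee, v.target));
    }
    // Assumption: summarize the remainder instead of dropping it silently.
    let hidden = violations.len().saturating_sub(MAX_REPORTED);
    if hidden > 0 {
        out.push_str(&format!("  ... and {} more\n", hidden));
    }
    Some(out)
}
```

Collecting before failing means the walker never short-circuits on the first bad call site, so one build surfaces the whole set of fixes.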

Pattern syntax

  • Exact host: "api.example.com".
  • Subdomain wildcard: "*.cdn.example.com" matches subdomains, not the bare suffix.
  • URL prefix: "https://api.acme.com/v1/*" matches path-bearing call sites; it does NOT match a bare-host net.connect("api.acme.com") (path-bound entries gate path-bound calls).
  • Universal: "*" matches everything (escape hatch).
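The four pattern shapes above could be modeled roughly as follows. This is a minimal sketch, not the actual matcher in `perry-hir::egress`: the `Pattern` enum, `parse_pattern`, and `matches` are hypothetical names, and the case-insensitive host comparison is an assumption. The caller is assumed to pass both the raw literal from the call site and the host already extracted from it.

```rust
/// One parsed `allowedHosts` entry (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Pattern {
    Universal,                 // "*"
    SubdomainWildcard(String), // "*.cdn.example.com", stored as ".cdn.example.com"
    UrlPrefix(String),         // "https://api.acme.com/v1/*", stored without the trailing '*'
    ExactHost(String),         // "api.example.com"
}

fn parse_pattern(raw: &str) -> Pattern {
    if raw == "*" {
        Pattern::Universal
    } else if raw.contains("://") {
        Pattern::UrlPrefix(raw.trim_end_matches('*').to_string())
    } else if let Some(suffix) = raw.strip_prefix("*.") {
        Pattern::SubdomainWildcard(format!(".{}", suffix.to_ascii_lowercase()))
    } else {
        Pattern::ExactHost(raw.to_ascii_lowercase())
    }
}

/// `literal` is the full string at the call site; `host` is the host taken from it.
fn matches(pattern: &Pattern, literal: &str, host: &str) -> bool {
    let host = host.to_ascii_lowercase();
    match pattern {
        Pattern::Universal => true,
        Pattern::ExactHost(h) => host == *h,
        // Requires at least one extra label, so the bare suffix does not match.
        Pattern::SubdomainWildcard(suffix) => {
            host.len() > suffix.len() && host.ends_with(suffix.as_str())
        }
        // Compared against the whole literal, so a bare host never matches.
        Pattern::UrlPrefix(prefix) => literal.starts_with(prefix.as_str()),
    }
}
```

Matching URL-prefix entries against the full literal rather than the extracted host is what keeps `net.connect("api.acme.com")` from slipping through a path-bound entry.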

Opt-in semantics

An empty or absent allowedHosts disables the pass entirely, so existing builds compile unchanged. The documented migration path:

  1. Run the build once without the allowlist.
  2. Inspect .perry-cache/audit.json (perry audit --sbom, #495) to discover what egress your binary currently performs.
  3. Populate allowedHosts with the surface you actually use.
  4. Re-build. The gate now catches future regressions.

Non-literal URLs / hosts

Variables, expressions, and template strings with substitutions defeat the static guarantee. They're refused by default; set perry.allowDynamicHosts: true to opt in.
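The decision for a single call site can be pictured as a small function over the argument's shape and the two config keys. The sketch below is illustrative only: `Arg`, `Verdict`, `Config`, and `check` are hypothetical names, and host matching is reduced to exact comparison plus "*" to keep it short. How "*" interacts with non-literal arguments is not stated in this PR; the sketch gates them independently as an assumption.

```rust
/// How the URL/host argument appears at a call site (hypothetical model).
enum Arg<'a> {
    Literal(&'a str), // a plain string literal
    Dynamic,          // variable, expression, or template string with substitutions
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Allowed,
    LiteralNotAllowed,
    DynamicNotAllowed,
}

/// Stand-in for `perry.allowedHosts` and `perry.allowDynamicHosts`.
struct Config<'a> {
    allowed_hosts: &'a [&'a str],
    allow_dynamic_hosts: bool,
}

fn check(cfg: &Config<'_>, arg: &Arg<'_>) -> Verdict {
    // Opt-in: an empty allowlist disables the pass entirely.
    if cfg.allowed_hosts.is_empty() {
        return Verdict::Allowed;
    }
    match arg {
        Arg::Literal(host) => {
            // Simplified matching: exact host or the universal "*".
            if cfg.allowed_hosts.iter().any(|p| *p == "*" || *p == *host) {
                Verdict::Allowed
            } else {
                Verdict::LiteralNotAllowed
            }
        }
        // Non-literals cannot be checked statically, so they need the opt-in.
        Arg::Dynamic if cfg.allow_dynamic_hosts => Verdict::Allowed,
        Arg::Dynamic => Verdict::DynamicNotAllowed,
    }
}
```

Keeping the dynamic case as its own verdict lets the diagnostic point users at `perry.allowDynamicHosts` rather than at the allowlist.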

Test coverage

13 unit tests in perry-hir::egress::tests:

  • empty_allowlist_disables_pass — opt-in invariant.
  • host_pattern_exact_match, host_pattern_subdomain_wildcard, universal_escape_hatch — basic shapes.
  • url_pattern_extracts_host — handles userinfo + port + path in URLs.
  • url_prefix_pattern — path-bound matching.
  • url_pattern_doesnt_match_bare_host — net.connect against URL-prefix entries.
  • fetch_literal_records_violation / fetch_literal_matching_passes — primary fetch case.
  • fetch_dynamic_url_blocked_by_default / fetch_dynamic_url_allowed_when_opted_in — non-literal + allowDynamicHosts.
  • net_connect_host_checked / net_connect_no_host_implicit_localhost_allowed — net.connect with and without host.
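For context on what url_pattern_extracts_host exercises, here is a minimal sketch of pulling the host out of a literal that may carry a scheme, userinfo, port, and path. The function name `extract_host` is hypothetical and the real implementation may differ; this version uses only the standard library and does not handle bracketed IPv6 literals.

```rust
/// Extract the host from a literal such as
/// "https://user:pw@api.example.com:8443/v1". Bare hosts pass through unchanged.
/// Assumption: no bracketed IPv6 literals.
fn extract_host(literal: &str) -> &str {
    // Drop the scheme, if any.
    let rest = match literal.find("://") {
        Some(i) => &literal[i + 3..],
        None => literal,
    };
    // The authority ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    // Drop userinfo: keep what follows the last '@'.
    let host_port = authority.rsplit('@').next().unwrap_or(authority);
    // Drop the port.
    host_port.split(':').next().unwrap_or(host_port)
}
```

Taking the text after the last '@' matters for security: a literal like "https://api.example.com@evil.com/" resolves to evil.com, not to the allowlisted-looking prefix.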

End-to-end smoke (all four cases verified against the release binary):

  • No allowedHosts → no check (legacy behavior preserved).
  • allowedHosts set, host NOT in list → fails with the right diagnostic.
  • Non-literal URL, no allowDynamicHosts → fails.
  • Non-literal URL, allowDynamicHosts: true → compiles cleanly.

Out of scope

  • http.get(url) / https.request(...) / WebSocket(url): these lower through the general-shape NativeMethodCall HIR variant, where URL extraction is harder. The MVP covers the highest-volume egress shapes; a follow-up will graft the rest onto the same pass.
  • perry audit --sbom literal_hosts integration: the #495 manifest is versioned, so this is a clean follow-up that adds a new key to ModuleAudit.

Acceptance

  • [x] Host package.json perry.allowedHosts: [...] with glob/URL-prefix patterns
  • [x] All fetch / net.connect / net.createConnection call sites analyzed (http.get/https.request/WebSocket deferred; see module doc)
  • [x] Literal host not in allowlist → build fails at call site with clear message
  • [x] Non-literal host → build fails unless perry.allowDynamicHosts: true
  • [x] Pattern matching: glob-style host wildcards + URL prefix
  • [x] Stronger than runtime allowlists: violations caught before binary exists
  • [deferred] perry audit literal_hosts integration: graft as a literal_hosts key in #495's ModuleAudit (follow-up)

Notes

No Cargo.toml version bump, no CLAUDE.md version line touch, no CHANGELOG.md entry — maintainer folds those in at merge time.

Adds a compile-time HIR pass that refuses fetch(url) and
net.connect(host) / net.createConnection(host) call sites whose
literal URL/host isn't covered by `perry.allowedHosts` in the host
package.json. Non-literal URLs are refused too unless
`perry.allowDynamicHosts: true` is set — preserving the static
"grep-the-binary-for-egress" guarantee.

The check is opt-in: empty `allowedHosts` disables the pass entirely
(existing builds compile unchanged). Once any pattern is set, the gate
is strict. Migration path documented as "use #495's perry audit --sbom
to discover what egress your binary currently performs, then populate
allowedHosts to match."

Pattern syntax:
- exact host: "api.example.com"
- subdomain wildcard: "*.cdn.example.com"
- URL prefix: "https://api.acme.com/v1/*"
- universal escape hatch: "*"

Pre-existing semantics preserved: net.connect with no host argument
(implicit localhost / unix socket) is not gated.

Cross-platform: the gate runs in the platform-agnostic
compile_command driver, so every backend (LLVM / WASM / ArkTS /
HarmonyOS / Glance / SwiftUI / JS) inherits the protection from one
choke point.

Diagnostic surfaces every offending site in a single error so the
user can fix them all at once. Capped at 12 entries to keep error
output reasonable.

Walker (`perry-hir::egress`):
- Covers FetchWithOptions / FetchGetWithAuth / FetchPostWithAuth /
  NetCreateConnection / NetConnect (the highest-volume egress shapes).
- http.get / https.request / WebSocket lower through general-shape
  NativeMethodCall and are deferred to a follow-up under the same
  pass shape.

13 unit tests in perry-hir::egress::tests cover:
- empty allowlist disables the pass
- exact host / subdomain wildcard / URL prefix / universal patterns
- host extraction from full URLs with userinfo / port / path
- bare-host argument against URL-prefix entry does NOT match
- fetch literal refusal + matching pass
- fetch dynamic URL blocked by default + allowed with opt-in
- net.connect host checked
- net.connect without host implicitly allowed

End-to-end smoke (all four cases verified against the release binary):
- no allowedHosts → no check (legacy behavior preserved)
- allowedHosts set, host NOT in list → fails with diagnostic
- non-literal URL, no allowDynamicHosts → fails
- non-literal URL, allowDynamicHosts: true → compiles

Acceptance:
- [x] Host package.json `perry.allowedHosts: [...]` with glob/URL-prefix patterns
- [x] All `fetch` / `net.connect` / `net.createConnection` call sites analyzed (http.get/https.request/WebSocket deferred — see module doc)
- [x] Literal host not in allowlist → build fails at call site with clear message
- [x] Non-literal host → build fails unless `perry.allowDynamicHosts: true`
- [x] Pattern matching: glob-style host wildcards + URL prefix
- [x] Stronger than runtime allowlists: violations caught before binary exists
- [deferred] `perry audit --sbom` lists every literal egress in the build for review — graft in same shape as a `literal_hosts` key (follow-up; #495 ships the v1 manifest shape)
@proggeramlug force-pushed the feat/502-egress-allowlist branch from 37e638a to e085f9c on May 18, 2026 10:56
@proggeramlug merged commit 4f1b257 into main on May 18, 2026
@proggeramlug deleted the feat/502-egress-allowlist branch on May 18, 2026 10:56