Skip to content

feat: providers result limits for JSON and NDJSON#149

Draft
lidel wants to merge 2 commits into
mainfrom
fix/json-records-limit-and-overfetch
Draft

feat: providers result limits for JSON and NDJSON#149
lidel wants to merge 2 commits into
mainfrom
fix/json-records-limit-and-overfetch

Conversation

@lidel
Copy link
Copy Markdown
Member

@lidel lidel commented May 14, 2026

Someguy supports JSON (buffered) and NDJSON (streaming) response types.

Problem

JSON delegated routing responses under-delivered: iiuc boxo's DefaultRecordsLimit of 20 acted as a pre-filter, and cacheFallbackIter silently dropped records without addresses.

Anecdotally (cc @aschmahmann), production at delegated-ipfs.dev returned ~4 providers on JSON (when opened in browser) vs ~17 on NDJSON for the same CID (/ipns/ipfs.tech).

As a side problem, NDJSON stream had no cap, which is a bad default.

Fix (This PR)

  • raise the JSON cap to 100 and add a streaming cap of 1000, matching HTTP Routing v1 section 4.1.5; expose both as --records-limit and --streaming-records-limit with SOMEGUY_* env vars; reject negative input, 0 disables the cap
  • over-fetch from the underlying router by 3x (cap 3000) so that, after cacheFallbackIter drops addr-less records, the surfaced count is close to the caller's limit; cap the final stream via a new limitedIter wrapper
    • not a fan of over-fetching, but these are mostly theoretical caps, most of CIDs will never hit 1k any time soon, and if we hit that ever, should be more than enough for retrieval (and a good problem to have)

Alternatives

Note

#149 and #150 fix the same bug two ways. Merge only one.

Feedback welcome, but my gut feeling is the #150 is preferable: it fixes the root cause in the shared layer. Needs ipfs/boxo#1157 first.

JSON delegated routing responses under-delivered: boxo's
DefaultRecordsLimit of 20 acted as a pre-filter, and cacheFallbackIter
silently dropped records without addresses. Production at
delegated-ipfs.dev returned ~4 providers on JSON vs ~17 on NDJSON for
the same CID.

- raise the JSON cap to 100 and add a streaming cap of 1000,
  matching HTTP Routing v1 section 4.1.5; expose both as
  `--records-limit` and `--streaming-records-limit` with `SOMEGUY_*`
  env vars; reject negative input, 0 disables the cap
- over-fetch from the underlying router by 3x (cap 3000) so that,
  after cacheFallbackIter drops addr-less records, the surfaced count
  is close to the caller's limit; cap the final stream via a new
  limitedIter wrapper
@lidel lidel requested a review from a team May 14, 2026 22:44
Comment thread server_cached_router.go Outdated
@lidel
Copy link
Copy Markdown
Member Author

lidel commented May 19, 2026

Triage note:

  • address feedback
  • double-check the max ceiling can be lifted by passing 0 (for DHT)
  • see if we can avoid complexity by fixing this upstream in boxo

@lidel lidel self-assigned this May 19, 2026
@lidel lidel marked this pull request as draft May 19, 2026 14:46
operators can now set --records-limit / --streaming-records-limit to
any positive value; previously the multiplied request to the underlying
router was clamped at 3000, capping the surviving count below high user
limits. passing 0 still disables both the cap and the over-fetch.

- drop cacheFallbackOverfetchMax; overfetchLimit returns
  limit * multiplier for positive limits
- rewrite over-fetch comments: the limit is an early-termination hint,
  not a DHT-walk cost amplifier (per review)
- slim CHANGELOG: move "callers get close to --records-limit" to Fixed,
  drop the over-fetch mechanism from Changed
@lidel lidel added the status/blocked Unable to be worked further until needs are met label May 19, 2026
@lidel
Copy link
Copy Markdown
Member Author

lidel commented May 19, 2026

@guillaumemichel applied feedback, but also did some thinking if doing this only in someguy is the right place (overfetch feels wasteful).

Alternative approach in:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status/blocked Unable to be worked further until needs are met

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants