fix: clean full-save output and complete flatten cleanup #72

Merged: Mythie merged 2 commits into main from fix/clean-full-save-and-flatten on May 18, 2026

@Mythie Mythie commented May 18, 2026

Three independent correctness bugs surfaced while trying to get a
flattened PDF to pass Adobe Reader's LTV (long-term validation) check
after signing. Each is fixed here, along with a related parser
robustness improvement that was already in tree.

  • PDF.getForm() cached the wrapper indefinitely, so form.flatten()
    (which removes /AcroForm from the catalog) left a stale PDFForm
    reachable. A subsequent getOrCreateForm() early-returned that stale
    wrapper and never re-added /AcroForm, so the saved PDF had no form
    even though the API claimed it did. Fixed by invalidating the cache
    whenever the catalog's /AcroForm state diverges from the cache.

  • FormFlattener only marked a widget for removal after successfully
    rendering its appearance. Widgets without a renderable /AP, with an
    invalid /BBox, or that were hidden got skipped before reaching the
    removal set. The parent field was deleted but the widget stayed in
    page /Annots as an orphan, which Adobe still rendered as an
    interactive form field. Fixed by marking the widget for removal up
    front regardless of whether we can draw it.

  • writeComplete() preserved source object numbers, so a full save
    after flatten left xref subsections like "0 5 / 9 11 / 27 7" with
    implicit free entries for the gaps. Adobe's LTV gate treats
    gap-laden baselines as suspect and silently suppresses the
    LTV-enabled badge even when the signature and DSS are valid. Fixed
    by renumbering reachable objects sequentially during full save,
    producing a single contiguous "0 N" subsection.

  • DocumentParser.parseXRefChain() now handles hybrid-reference files
    (PDF 1.7, section 7.5.8.4): a traditional xref table whose trailer
    carries /XRefStm pointing to a supplementary xref stream. Without
    this, every compressed object in such documents was reported as free.
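
The renumbering fix described in the third bullet can be illustrated with a minimal sketch. It is not the code from this PR: the `Ref`, `Dict`, and `Value` types and the `renumberReachable` helper are hypothetical stand-ins for the library's object model. The idea is to walk the object graph from the root, assign new numbers 1..N in discovery order, and rewrite every reference through that mapping, so the writer can emit one contiguous xref subsection (entry 0 plus N objects).

```typescript
// Hypothetical minimal object model for illustration; not libpdf's real types.
class Ref {
  readonly num: number;
  constructor(num: number) {
    this.num = num;
  }
}
type Dict = { [key: string]: Value };
type Value = Ref | number | string | null | Value[] | Dict;

// Call fn for every reference found anywhere inside a value.
function forEachRef(v: Value, fn: (r: Ref) => void): void {
  if (v instanceof Ref) fn(v);
  else if (Array.isArray(v)) v.forEach((x) => forEachRef(x, fn));
  else if (typeof v === "object" && v !== null) {
    for (const key of Object.keys(v)) forEachRef(v[key], fn);
  }
}

// Return a copy of a value with every reference mapped to its new number.
// A reference to an object that was not reached becomes null, which matches
// how PDF treats a reference to an undefined object.
function rewriteRefs(v: Value, mapping: Map<number, number>): Value {
  if (v instanceof Ref) {
    const next = mapping.get(v.num);
    return next === undefined ? null : new Ref(next);
  }
  if (Array.isArray(v)) return v.map((x) => rewriteRefs(x, mapping));
  if (typeof v === "object" && v !== null) {
    const out: Dict = {};
    for (const key of Object.keys(v)) out[key] = rewriteRefs(v[key], mapping);
    return out;
  }
  return v;
}

// Breadth-first walk from the root; reachable objects get numbers 1..N.
function renumberReachable(objects: Map<number, Value>, root: number): Map<number, Value> {
  const order: number[] = [];
  const seen = new Set<number>([root]);
  const queue: number[] = [root];
  while (queue.length > 0) {
    const num = queue.shift()!;
    const body = objects.get(num);
    if (body === undefined) continue; // dangling target: nothing to write
    order.push(num);
    forEachRef(body, (ref) => {
      if (!seen.has(ref.num)) {
        seen.add(ref.num);
        queue.push(ref.num);
      }
    });
  }
  const mapping = new Map<number, number>();
  order.forEach((oldNum, index) => mapping.set(oldNum, index + 1));
  const result = new Map<number, Value>();
  for (const oldNum of order) {
    result.set(mapping.get(oldNum)!, rewriteRefs(objects.get(oldNum)!, mapping));
  }
  return result;
}

// Source numbering with gaps (1, 4, 9, 27); object 4 is unreachable.
const source = new Map<number, Value>([
  [1, { Type: "Catalog", Pages: new Ref(9) }],
  [4, { Note: "orphan left behind by flatten" }],
  [9, { Type: "Pages", Kids: [new Ref(27)] }],
  [27, { Type: "Page", Parent: new Ref(9) }],
]);
const renumbered = renumberReachable(source, 1);
console.log(Array.from(renumbered.keys())); // keys are 1, 2, 3
```

In this sketch the unreachable object 4 is dropped and the page's /Parent reference is rewritten from 9 to 2, so the output needs no free entries beyond the mandatory entry 0.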

Regression tests are added for each fix. The writer suite also gains a
dangling-reference check that parses the output's xref and asserts that
every "N G R" reference in any indirect object body points to a live
(in-use) entry.
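
A dangling-reference check of this kind could look roughly like the sketch below. It is not the test from this PR: `parseLiveEntries`, `findDanglingRefs`, and `buildSample` are hypothetical names, the sketch assumes a single classic xref table (no xref streams or incremental updates), and it scans object bodies with a naive regex that would misfire on reference-like text inside strings or streams.

```typescript
// Collect "num gen" keys for every in-use ("n") entry of a classic xref table.
function parseLiveEntries(pdf: string): Set<string> {
  const m = /startxref\s+(\d+)\s+%%EOF\s*$/.exec(pdf);
  if (!m) throw new Error("startxref not found");
  const lines = pdf.slice(Number(m[1])).split(/\r?\n|\r/);
  if (lines[0].trim() !== "xref") throw new Error("expected a classic xref table");
  const live = new Set<string>();
  let i = 1;
  while (i < lines.length && lines[i].trim() !== "trailer") {
    // Subsection header: "<first object number> <entry count>"
    const [first, count] = lines[i].trim().split(/\s+/).map(Number);
    i++;
    for (let k = 0; k < count; k++, i++) {
      // Entry: "<10-digit offset> <5-digit generation> <n|f>"
      const [, gen, flag] = lines[i].trim().split(/\s+/);
      if (flag === "n") live.add(`${first + k} ${Number(gen)}`);
    }
  }
  return live;
}

// Report every "N G R" inside an indirect object body that has no live entry.
function findDanglingRefs(pdf: string): string[] {
  const live = parseLiveEntries(pdf);
  const dangling: string[] = [];
  const objRe = /(\d+) (\d+) obj\b([\s\S]*?)endobj/g;
  let obj: RegExpExecArray | null;
  while ((obj = objRe.exec(pdf)) !== null) {
    const refRe = /(\d+) (\d+) R\b/g;
    let ref: RegExpExecArray | null;
    while ((ref = refRe.exec(obj[3])) !== null) {
      const key = `${Number(ref[1])} ${Number(ref[2])}`;
      if (!live.has(key)) dangling.push(`${obj[1]} ${obj[2]} obj -> ${key} R`);
    }
  }
  return dangling;
}

// Build a tiny three-object PDF (ASCII only, so string offsets equal byte
// offsets). kidRef is what /Kids points at.
function buildSample(kidRef: string): string {
  const body =
    "%PDF-1.7\n" +
    "1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n" +
    `2 0 obj\n<< /Type /Pages /Count 1 /Kids [${kidRef}] >>\nendobj\n` +
    "3 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n";
  const entry = (marker: string) =>
    ("0000000000" + body.indexOf(marker)).slice(-10) + " 00000 n \n";
  const xref =
    "xref\n0 4\n0000000000 65535 f \n" +
    entry("1 0 obj") + entry("2 0 obj") + entry("3 0 obj") +
    "trailer\n<< /Size 4 /Root 1 0 R >>\n";
  return body + xref + `startxref\n${body.length}\n%%EOF\n`;
}

console.log(findDanglingRefs(buildSample("3 0 R"))); // no dangling references
console.log(findDanglingRefs(buildSample("7 0 R"))); // reports "2 0 obj -> 7 0 R"
```

A production check would reuse the real parser rather than regexes; the sketch only shows the shape of the assertion.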

vercel Bot commented May 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| core | Ready | Preview, Comment | May 18, 2026 8:22am |

github-actions Bot commented May 18, 2026

Benchmark Results

Comparison

Load PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 2.34ms | 4.11ms | ±1.9% | 214 |
| pdf-lib | 39.83ms | 44.55ms | ±4.0% | 13 |
| @cantoo/pdf-lib | 40.53ms | 47.60ms | ±3.7% | 13 |

Create blank PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 67μs | 150μs | ±1.6% | 7452 |
| pdf-lib | 406μs | 1.36ms | ±2.2% | 1231 |
| @cantoo/pdf-lib | 428μs | 1.51ms | ±2.4% | 1171 |

Add 10 pages

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 134μs | 235μs | ±1.0% | 3727 |
| pdf-lib | 536μs | 1.84ms | ±2.8% | 933 |
| @cantoo/pdf-lib | 480μs | 2.30ms | ±3.3% | 1044 |

Draw 50 rectangles

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 392μs | 984μs | ±1.5% | 1276 |
| pdf-lib | 1.94ms | 7.68ms | ±7.6% | 259 |
| @cantoo/pdf-lib | 2.12ms | 4.82ms | ±5.0% | 236 |

Load and save PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 2.41ms | 4.85ms | ±2.3% | 208 |
| pdf-lib | 87.19ms | 100.84ms | ±5.2% | 10 |
| @cantoo/pdf-lib | 156.32ms | 164.32ms | ±2.1% | 10 |

Load, modify, and save PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 54.11ms | 63.80ms | ±7.3% | 10 |
| pdf-lib | 89.86ms | 99.19ms | ±4.7% | 10 |
| @cantoo/pdf-lib | 155.28ms | 163.18ms | ±1.8% | 10 |

Extract single page from 100-page PDF

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 3.78ms | 4.37ms | ±0.8% | 133 |
| pdf-lib | 9.41ms | 13.06ms | ±2.7% | 54 |
| @cantoo/pdf-lib | 10.14ms | 19.23ms | ±4.6% | 50 |

Split 100-page PDF into single-page PDFs

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 43.41ms | 46.22ms | ±2.3% | 12 |
| pdf-lib | 90.83ms | 93.85ms | ±2.1% | 6 |
| @cantoo/pdf-lib | 94.59ms | 97.54ms | ±3.1% | 6 |

Split 2000-page PDF into single-page PDFs (0.9MB)

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 792.13ms | 792.13ms | ±0.0% | 1 |
| pdf-lib | 1.68s | 1.68s | ±0.0% | 1 |
| @cantoo/pdf-lib | 1.73s | 1.73s | ±0.0% | 1 |

Copy 10 pages between documents

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 4.77ms | 5.43ms | ±0.8% | 105 |
| pdf-lib | 12.11ms | 14.43ms | ±1.7% | 42 |
| @cantoo/pdf-lib | 13.56ms | 14.89ms | ±1.6% | 37 |

Merge 2 x 100-page PDFs

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 16.00ms | 17.34ms | ±1.2% | 32 |
| pdf-lib | 54.54ms | 55.31ms | ±0.5% | 10 |
| @cantoo/pdf-lib | 64.77ms | 66.22ms | ±1.1% | 8 |

Fill FINTRAC form fields

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 23.37ms | 40.99ms | ±8.9% | 22 |
| pdf-lib | 35.81ms | 50.80ms | ±7.0% | 15 |
| @cantoo/pdf-lib | 35.31ms | 41.20ms | ±5.3% | 15 |

Fill and flatten FINTRAC form

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| libpdf | 20.03ms | 24.10ms | ±3.3% | 25 |
| pdf-lib | FAILED | - | - | 0 |
| @cantoo/pdf-lib | 42.17ms | 62.27ms | ±10.1% | 12 |

Copying

Copy pages between documents

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| copy 1 page | 1.14ms | 2.24ms | ±2.8% | 438 |
| copy 10 pages from 100-page PDF | 4.88ms | 8.84ms | ±3.1% | 103 |
| copy all 100 pages | 8.18ms | 8.93ms | ±1.1% | 62 |

Duplicate pages within same document

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| duplicate page 0 | 1.08ms | 1.95ms | ±1.4% | 463 |
| duplicate all pages (double the document) | 1.05ms | 1.76ms | ±1.0% | 476 |

Merge PDFs

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| merge 2 small PDFs | 1.63ms | 2.51ms | ±1.3% | 306 |
| merge 10 small PDFs | 8.38ms | 11.18ms | ±2.0% | 60 |
| merge 2 x 100-page PDFs | 15.25ms | 17.40ms | ±1.2% | 33 |

Drawing

benchmarks/drawing.bench.ts

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| draw 100 rectangles | 642μs | 1.20ms | ±1.4% | 780 |
| draw 100 circles | 1.42ms | 2.93ms | ±2.6% | 353 |
| draw 100 lines | 611μs | 1.36ms | ±1.6% | 818 |
| draw 100 text lines (standard font) | 1.71ms | 2.77ms | ±1.7% | 293 |
| create 10 pages with mixed content | 1.52ms | 2.38ms | ±1.5% | 328 |

Forms

benchmarks/forms.bench.ts

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| get form fields | 3.52ms | 8.45ms | ±5.2% | 142 |
| fill text fields | 13.18ms | 17.88ms | ±4.3% | 38 |
| read field values | 3.00ms | 4.35ms | ±1.4% | 167 |
| flatten form | 8.39ms | 11.59ms | ±1.9% | 60 |

Loading

benchmarks/loading.bench.ts

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| load small PDF (888B) | 70μs | 162μs | ±3.9% | 7173 |
| load medium PDF (19KB) | 107μs | 188μs | ±4.4% | 4691 |
| load form PDF (116KB) | 1.41ms | 2.83ms | ±2.1% | 355 |
| load heavy PDF (9.9MB) | 2.36ms | 3.94ms | ±1.9% | 212 |

Saving

benchmarks/saving.bench.ts

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| save unmodified (19KB) | 113μs | 265μs | ±2.6% | 4421 |
| save with modifications (19KB) | 910μs | 2.82ms | ±3.2% | 550 |
| incremental save (19KB) | 174μs | 350μs | ±3.1% | 2888 |
| save heavy PDF (9.9MB) | 2.49ms | 6.21ms | ±5.1% | 201 |
| incremental save heavy PDF (9.9MB) | 8.12ms | 10.35ms | ±3.4% | 62 |

Splitting

Extract single page

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| extractPages (1 page from small PDF) | 1.17ms | 2.16ms | ±2.8% | 429 |
| extractPages (1 page from 100-page PDF) | 3.81ms | 6.26ms | ±2.1% | 132 |
| extractPages (1 page from 2000-page PDF) | 58.74ms | 60.62ms | ±1.2% | 10 |

Split into single-page PDFs

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| split 100-page PDF (0.1MB) | 41.82ms | 46.17ms | ±3.3% | 12 |
| split 2000-page PDF (0.9MB) | 755.58ms | 755.58ms | ±0.0% | 1 |

Batch page extraction

| Benchmark | Mean | p99 | RME | Samples |
| --- | --- | --- | --- | --- |
| extract first 10 pages from 2000-page PDF | 61.35ms | 62.87ms | ±1.0% | 9 |
| extract first 100 pages from 2000-page PDF | 65.78ms | 67.64ms | ±2.1% | 8 |
| extract every 10th page from 2000-page PDF (200 pages) | 70.57ms | 79.76ms | ±4.4% | 8 |

Environment
  • Runner: Linux (X64)
  • Runtime: Bun 1.3.14

Results are machine-dependent.

@Mythie Mythie merged commit 2f9cdb2 into main May 18, 2026
6 checks passed