fix: clean full-save output and complete flatten cleanup #72
Merged
Three independent correctness bugs surfaced while trying to get a
flattened PDF to pass Adobe Reader's LTV (long-term validation) check
after signing. Each is fixed here, along with a related parser
robustness improvement that was already in tree.

PDF.getForm() cached the wrapper indefinitely, so form.flatten()
(which removes /AcroForm from the catalog) left a stale PDFForm
reachable. A subsequent getOrCreateForm() early-returned that stale
wrapper and never re-added /AcroForm, so the saved PDF had no form
even though the API claimed it did. Fixed by invalidating the cache
whenever the catalog's /AcroForm state diverges from the cache.
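
The invalidation rule can be illustrated with a minimal sketch. `SketchDocument` and `SketchForm` are hypothetical stand-ins, not the library's real classes, and the catalog is modelled as a plain `Map`; the only point is that the cached wrapper is dropped as soon as it no longer matches the catalog's /AcroForm entry.

```typescript
// Hypothetical stand-ins for illustration only; not the library's real types.
type SketchDict = Map<string, unknown>;

class SketchForm {
  constructor(readonly dict: SketchDict) {}
}

class SketchDocument {
  readonly catalog: SketchDict = new Map<string, unknown>();
  private cachedForm: SketchForm | null = null;

  getForm(): SketchForm | null {
    const current = this.catalog.get("AcroForm");
    // Drop the cache when it diverges from what the catalog holds now.
    if (this.cachedForm !== null && this.cachedForm.dict !== current) {
      this.cachedForm = null;
    }
    if (this.cachedForm === null && current instanceof Map) {
      this.cachedForm = new SketchForm(current as SketchDict);
    }
    return this.cachedForm;
  }

  getOrCreateForm(): SketchForm {
    const existing = this.getForm();
    if (existing !== null) {
      return existing;
    }
    // No live /AcroForm: create one and register it in the catalog.
    const dict: SketchDict = new Map<string, unknown>([["Fields", []]]);
    this.catalog.set("AcroForm", dict);
    this.cachedForm = new SketchForm(dict);
    return this.cachedForm;
  }

  // Stand-in for flatten(): only the catalog change matters here.
  flatten(): void {
    this.catalog.delete("AcroForm");
  }
}
```

In this sketch, calling `getOrCreateForm()`, then `flatten()`, then `getOrCreateForm()` again returns a fresh wrapper and puts /AcroForm back in the catalog, rather than handing out the stale one.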

FormFlattener only marked a widget for removal after successfully
rendering its appearance. Widgets without a renderable /AP, with an
invalid /BBox, or that were hidden got skipped before reaching the
removal set. The parent field was deleted but the widget stayed in
page /Annots as an orphan, which Adobe still rendered as an
interactive form field. Fixed by marking the widget for removal up
front regardless of whether we can draw it.
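
A rough sketch of the reordered loop, under stated assumptions: `SketchWidget`, `canRender`, and `flattenWidgets` are hypothetical names, and rendering is reduced to recording an id. What it shows is the removal mark happening before any renderability check.

```typescript
// Hypothetical widget model for illustration; the real flattener differs.
interface SketchWidget {
  id: number;
  hidden: boolean;
  appearance: string | null; // stands in for a renderable /AP stream
  bbox: [number, number, number, number] | null;
}

function canRender(w: SketchWidget): boolean {
  if (w.hidden || w.appearance === null || w.bbox === null) {
    return false;
  }
  const width = w.bbox[2] - w.bbox[0];
  const height = w.bbox[3] - w.bbox[1];
  return width > 0 && height > 0;
}

function flattenWidgets(widgets: SketchWidget[]): { drawn: number[]; removed: Set<number> } {
  const drawn: number[] = [];
  const removed = new Set<number>();
  for (const w of widgets) {
    // Mark for removal first, so an undrawable widget cannot be orphaned.
    removed.add(w.id);
    if (!canRender(w)) {
      continue;
    }
    drawn.push(w.id); // placeholder for drawing the appearance onto the page
  }
  return { drawn, removed };
}
```

With the mark placed after the `canRender` check instead, hidden widgets and widgets with a missing /AP or a degenerate /BBox would never reach `removed`, which is the orphan case described above.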

writeComplete() preserved source object numbers, so a full save
after flatten left xref subsections like "0 5 / 9 11 / 27 7" with
implicit free entries for the gaps. Adobe's LTV gate treats
gap-laden baselines as suspect and silently suppresses the
LTV-enabled badge even when the signature and DSS are valid. Fixed
by renumbering reachable objects sequentially during full save,
producing a single contiguous "0 N" subsection.
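
One way to picture the renumbering is a reachability walk that hands out new numbers in visit order. This is a sketch, not the writer's actual code: `renumberReachable` and `xrefSubsectionHeader` are hypothetical helpers, the object graph is reduced to a map from object number to the numbers it references, and generation numbers are ignored.

```typescript
// Hypothetical helper: maps old object numbers to new sequential ones.
// `refs` maps each object number to the object numbers its body references.
function renumberReachable(refs: Map<number, number[]>, roots: number[]): Map<number, number> {
  const mapping = new Map<number, number>();
  const queue: number[] = [...roots];
  let next = 1; // object 0 is reserved for the head of the free list
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === undefined) {
      break;
    }
    // Skip objects already numbered and references to objects that do not exist.
    if (mapping.has(current) || !refs.has(current)) {
      continue;
    }
    mapping.set(current, next);
    next += 1;
    for (const child of refs.get(current) ?? []) {
      queue.push(child);
    }
  }
  return mapping;
}

// With N renumbered objects plus object 0, the xref needs one subsection.
function xrefSubsectionHeader(mapping: Map<number, number>): string {
  return `0 ${mapping.size + 1}`;
}
```

Objects that are not reachable from the roots never receive a new number, so they drop out of the output and leave no gaps behind.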

DocumentParser.parseXRefChain() now handles hybrid-reference files
(PDF 1.7, section 7.5.8.4): a traditional xref table whose trailer carries
/XRefStm pointing to a supplementary xref stream. Without this,
every compressed object in such documents was reported as free.
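
The lookup order for one hybrid section can be sketched as a merge. The types and the `mergeHybridSection` / `lookupEntry` helpers below are assumptions made for illustration, not the parser's real API. The sketch follows the spec's rule that an object not found in the traditional table is looked up in the /XRefStm stream before moving on to /Prev; handling of /Prev itself is left out.

```typescript
// Hypothetical entry model for illustration only.
type SketchXRefEntry =
  | { kind: "free" }
  | { kind: "inUse"; offset: number }
  | { kind: "compressed"; streamObj: number; index: number };

// Entries listed in the traditional table win; the /XRefStm stream supplies
// the object numbers the table does not list (typically compressed objects).
function mergeHybridSection(
  table: Map<number, SketchXRefEntry>,
  xrefStm: Map<number, SketchXRefEntry>,
): Map<number, SketchXRefEntry> {
  const merged = new Map<number, SketchXRefEntry>(xrefStm);
  table.forEach((entry, objNum) => {
    merged.set(objNum, entry);
  });
  return merged;
}

// An object number with no entry anywhere is treated as free.
function lookupEntry(merged: Map<number, SketchXRefEntry>, objNum: number): SketchXRefEntry {
  return merged.get(objNum) ?? { kind: "free" };
}
```

If the stream is never read, `lookupEntry` falls through to `free` for every object that only the stream describes, which matches the symptom reported above.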

Regression tests added for each. The writer suite also gains a
dangling-reference check that parses the output's xref and asserts
every "N G R" in any indirect object body points to a live entry.
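
The shape of such a check might look like the sketch below. `findDanglingRefs` is a hypothetical helper, and scanning with a regular expression is a simplification: it does not skip string literals or stream data, which a real check would have to handle.

```typescript
// Hypothetical helper: returns "N G" keys referenced in `body` that are not live.
// `live` holds "objectNumber generation" keys taken from the parsed xref.
function findDanglingRefs(body: string, live: Set<string>): string[] {
  const dangling: string[] = [];
  // Matches indirect references such as "12 0 R"; \b keeps "RG" from matching.
  const pattern = /(\d+)\s+(\d+)\s+R\b/g;
  let match: RegExpExecArray | null = pattern.exec(body);
  while (match !== null) {
    const key = `${match[1]} ${match[2]}`;
    if (!live.has(key)) {
      dangling.push(key);
    }
    match = pattern.exec(body);
  }
  return dangling;
}
```

A test would run this over every indirect object body in the saved file and assert that the combined result is empty.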