Add experimental PDF/A compliance mode #3269
The diff looks good, though I have some concerns about the licensing of the binary file.
@Starfox64 do you think it would solve #3269?
That's the wrong issue ID, I think.
Indeed, I meant this one: #3442. Could you rebase your work? 🇫🇷 👋🏻
I've done a merge recently and it looks like I'm ahead of master. With regard to #3442, I think this might be fixed by this line: https://github.com/dompdf/dompdf/pull/3269/files#diff-f365bcb24d5c081b6bfde4aa59a8eda68824ea9658c362098554fc659dcec1d0R3426
It's funny to see that we are literally trying to do the same thing, by the way. Nothing spells fun like trying to embed XML in PDF metadata.
Okay, I got it wrong:
https://github.com/williamdes/dompdf/tree/pdfa-2.0 and https://github.com/williamdes/dompdf/tree/feat/pdfa is a rebased version without merges
FYI, I'm re-targeting this to 3.1 (from 3.0.1) since it's a new feature, not a patch. I know there are a lot of issues slated for 3.0.x releases, but I suspect most of those will be pushed out once I have a chance to review them.
Can you add "Fixes: #3442" to this PR?
It passes nearly all compliance tests, except for one:
[Specification: ISO 19005-3:2012, Clause: 6.2.11.4.1, Test number: 1](https://github.com/veraPDF/veraPDF-validation-profiles/wiki/PDFA-Parts-2-and-3-rules#rule-621141-1)
"The font programs for all fonts used for rendering within a conforming file shall be embedded within that file, as defined in ISO 32000-1:2008, 9.9." Result: Failed
Let's follow up on #3443
Would you mind reviewing this first batch, @bsweeney?
FYI, I force pushed a rebase for this branch so the merge commit wasn't necessary. I'll take a look at what else we need and will comment and/or commit additional changes.
Thank you. Let us know if any help is needed.
FYI, it's possible to achieve PDF/A compliance with the changes here. I'll add a page to the wiki outlining the requirements and provide sample code. To address your specific issue regarding fonts, you should set your default font by passing it in as part of the instantiation options, and your document should not specify any fonts that would break compliance. For example, if you style your document with a generic font family, then Dompdf will translate that into a core PDF font. Something like the following should generate a compliant document:
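A minimal sketch of what such a setup might look like. It assumes the mode added by this PR is exposed as an `isPdfAEnabled` option (that key name is an assumption here, so check the wiki page or the `Options` class for the final name) and that the bundled DejaVu Sans font is used as the embeddable default. The helper names `pdfaOptions` and `renderPdfA` are illustrative and not part of Dompdf.

```php
<?php
use Dompdf\Dompdf;

/**
 * Build instantiation options for a PDF/A attempt.
 * NOTE: 'isPdfAEnabled' is an assumed key name for the mode added in this PR.
 * 'defaultFont' is the existing Dompdf option for the fallback font.
 */
function pdfaOptions(string $defaultFont = 'DejaVu Sans'): array
{
    return [
        'isPdfAEnabled' => true,
        'defaultFont'   => $defaultFont,
    ];
}

/**
 * Render HTML to a PDF string using the options above.
 * The HTML should not name any font family, so that all text falls back
 * to the embeddable default font instead of a core PDF font.
 * Requires Dompdf to be autoloadable (for example via Composer).
 */
function renderPdfA(string $html): string
{
    $dompdf = new Dompdf(pdfaOptions());
    $dompdf->loadHtml($html);
    $dompdf->render();
    return (string) $dompdf->output();
}
```

With Composer's autoloader loaded, `file_put_contents('out.pdf', renderPdfA('<p>Hello</p>'))` would write the result to disk. Running the output through a validator such as veraPDF is still advisable, since the mode is experimental.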
I'm still thinking through how to ensure text styled with a generic family uses an embeddable font. Most likely it will require remapping the font families prior to render. I'll include that information in the wiki once I have a solution.
Co-authored-by: William Desportes <williamdes@wdes.fr>
For reference, I am posting some documentation I found on mPDF. It seems to have been implemented in mpdf/mpdf#558. Maybe we can copy the example and adapt it for dompdf.
While I do see some room for improvement, I think it's best to put some of these potential changes off to a future release so we can move forward with getting this out there for those who need the functionality now.
I do have some improvements so hold tight just a little longer.
I ran the code through my local test suite and found a few minor compatibility problems. I also enabled PDF/A when using the PDFLib backend (which did not require many changes). The only thing I haven't been able to address so far is compliance when using CMYK colors/images. I'll look at that issue a bit more, but I suspect addressing CMYK support may have to wait. I think the correct resolution would be to use a consistent color profile in the document, which would require a bit of effort. Plus, the CMYK profiles from ICC are fairly large compared to Dompdf itself (2 - 8 MB for the ones I looked at).
- Sets appropriate flags on link annotations
- Reworks the OutputIntents object as an array of intents
- Adds line breaks where required around object identifiers
- Sets max codepoint to 255 for binary mode header bytes
MissingWidth: Some monospace fonts do not include the MissingWidth value in the font metrics (see DejaVu Sans Mono). In this scenario use the width of the space character if it exists.

CapHeight: FontLib now retrieves the CapHeight, so use that value if present. Otherwise continue to use the Ascender value.
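The two fallbacks in this commit note can be sketched roughly as follows. This is an illustration, not the code from the PR: the function name `resolveFontMetrics` is hypothetical, and the layout of the metrics array (a `C` entry mapping character codes to widths, with the space character at code 32) is an assumption about how parsed font metrics are stored.

```php
<?php
/**
 * Hypothetical helper: choose MissingWidth and CapHeight with fallbacks.
 * $metrics is assumed to be an associative array of parsed font metrics,
 * where $metrics['C'] maps character codes to glyph widths.
 */
function resolveFontMetrics(array $metrics): array
{
    // Prefer the declared MissingWidth; otherwise fall back to the width
    // of the space character (code 32) if the font provides one.
    $missingWidth = $metrics['MissingWidth'] ?? ($metrics['C'][32] ?? null);

    // Prefer CapHeight when the metrics include it; otherwise keep using
    // the Ascender value.
    $capHeight = $metrics['CapHeight'] ?? ($metrics['Ascender'] ?? null);

    return ['MissingWidth' => $missingWidth, 'CapHeight' => $capHeight];
}
```

Both lookups return `null` when neither the preferred value nor its fallback is available, which leaves the decision about a final default to the caller.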
Hello Brian, this is very good news.
https://github.com/dompdf/dompdf/wiki/PDFA-Support
I think it's perfectly fine to add a hint to the wiki. You could even throw an exception, IMO.
Indeed, PDFLib will throw exceptions for non-conforming elements. Soft failure does seem to be unsupportive of the end-user goal without at least a notification that the generated PDF will not be compliant. Perhaps it does make sense to throw an exception if a non-conforming element (e.g., CMYK element or non-embeddable font) is used.
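As a rough illustration of the "throw instead of failing softly" idea, a check along these lines could run whenever a font or color is about to be used. Everything here is hypothetical: the exception class, the function name, and the shape of the `$usage` array are invented for this sketch and do not reflect Dompdf's actual internals.

```php
<?php
/** Hypothetical exception type for this sketch; not part of Dompdf. */
class PdfAComplianceException extends RuntimeException
{
}

/**
 * Throw when PDF/A mode is enabled and known non-conforming content is used.
 * $usage is an illustrative array with optional keys:
 *   'fontEmbeddable' => bool, 'colorSpace' => 'rgb' or 'cmyk'.
 */
function assertPdfAConforming(bool $pdfAEnabled, array $usage): void
{
    if (!$pdfAEnabled) {
        return; // Nothing to enforce outside PDF/A mode.
    }
    if (($usage['fontEmbeddable'] ?? true) === false) {
        throw new PdfAComplianceException('Font cannot be embedded.');
    }
    if (strtolower($usage['colorSpace'] ?? 'rgb') === 'cmyk') {
        throw new PdfAComplianceException('CMYK content is not supported in PDF/A mode.');
    }
}
```

A caller that prefers a warning over a hard failure could catch `PdfAComplianceException` and log it instead, which keeps both behaviors available.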
Known non-compliant content includes non-embeddable fonts, CMYK colors, and CMYK images.
Awesome, thanks a lot!
This PR adds an experimental PDF/A compliance mode and closes #1106.
I'm calling this experimental for multiple reasons:
This should still work for most people on non-esoteric documents.
Addendum
Fixes #3442