-
Notifications
You must be signed in to change notification settings - Fork 897
Use alternative libraries for PDF exports #6364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
I optimized things a little bit (so pages that have neither been annotated nor duplicated are not modified at all). |
I tried building with podofo-1.0.0-beta1 - do you need even newer (git master)? Seems so, given that some structures/types are off. Ow wait, there's a beta2 .... going back :| |
OK, that (beta2) works. Noticed:
|
Thanks! Cmake passed cleanly. I've only tested mupdf. |
Fixed! |
Thanks! I've confirmed that the bug has been fixed. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as outdated.
This comment was marked as outdated.
So, I wasn't very happy with the choices of libraries I made, so I implemented a QPDF-based of the thing (you can try it via
Not quite certain what the best choice is, but I'd lean towards QPDF for now. |
On Fedora, mupdf-cpp-libs+deps weighs in at 55MiB when installed. 6.2MiB are leptonica+tesseract, 46.7MiB mupdf-libs. I think the main reason is embedded fonts (which are trimmed down a bit in Fedora compared to the default) and the overall functionality. From a packager and user perspective, the ideal solution would be export plugins which can be built as subpackages and installed separately. For a mupdf/zathura/qpdfview/... user, the xournalpp-mupdf-plugin does not pull in any extra dependencies which they haven't installed already. Others may prefer other export plugins. Subpackages would give everyone their choice. I know this requires extra effort, but since you've abstracted the interface already ... ;-) |
This comment was marked as resolved.
This comment was marked as resolved.
and then
This is with |
@mjg: would replacing this line by
work? I'm guessing they added an implicit conversion since 11.10.1 |
Yes, this compiles! I had tried with QPDF_FUTURE meanwhile which didn't help. BTW: You might want to
in src/core/pdf/base/QPdfExport.{h,cpp}. |
And I'm guessing setting: |
So, in principle qpdf works. A few things I noticed in comparison:
|
The code for the |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
PoDoFo maintainer here. Few replies to your very interesting comments:
Should be fixed in master. See my comment here.
You can disable metadata updating when saving with
Oh, yes: this makes no sense at all. I demoted the log to Debug in podofo/podofo@85c6e68 .
This is sad but it will be definitely fixed at some point after 1.0 release. It just didn't escalate in my job so far, as I have other priorities.
Actually it is not that young (first commit dates back to 2006). I'm the current maintainer after the original author retired. The stable release is 0.10.4 but 1.0 is coming soon (I hope 1-2 weeks away) and it will feature a robust (if not stable) API.
We are in contact with Debian/Ubuntu devs so 1.0 will receive a fast track and 0.10.x releases skipped.
Some broken PDFs (not standard compliant) are not supported and will require a recovery mode as implemented in other parsers. Unfortunately this is currently not my priority, so again it's something that will come after 1.0.
PoDoFo will be soon relicensed to dual LGPL 2+/MPL2. Probably shortly after 1.0. |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
…n.cmake It emerged from comments in[1][2] that current podofo-config.cmake version is not really working to . Guide at[3] was followed. Also added public headers to the target with install target_sources(... FILE_SET ...) as documented in [4], which were absent from exports anyway in podofo-config.cmake as well. This increases the cmake requirement to 3.23. Initial export patches were introduced by vcpkg maintainers, as described here[5] [1] xournalpp/xournalpp#6364 (comment) [2] xournalpp/xournalpp#6364 (comment) [3] https://cmake.org/cmake/help/latest/guide/tutorial/Adding%20Export%20Configuration.html [4] https://cmake.org/cmake/help/latest/command/target_sources.html#file-sets [5] microsoft/vcpkg#31584 (comment)
I greatly improved creation of |
That's great! Thanks a lot!
seem to behave exactly as expected. I encountered an include issue though:
The point seems to be that |
…nfig.cmake exports improvements Should fix the issue reported here xournalpp/xournalpp#6364 (comment)
@bhennion : Thank you for the feedback. It was a regression in the location where the generated Please consider that the automatic config file creation was introduced by a random vcpkg patch I found just lurking their issues, which I decided to merge. I didn't know that somehow this wasn't actually working: header files were not exported, so I have no idea how the headers location resolution would have worked. |
I'm guessing this won't work. The user will have to add the right path to some environment variable. |
In any case, it all seems to work fine now. Thanks a lot! |
Altering
It's the first time ever I try to locate libraries this way, I have to say it's an interesting mechanism for package managers. I am now tagging 1.0-rc1. I will inform you here when 1.0 final is released (few days). |
PoDoFo 1.0.0 released (announcement). |
Add choice in export dialog
Hi! I just wanted to thank you for your code. I just built your pr/podofo branch with A typical use case for me is annotating math courses—mostly generated by LaTeX—with clickable table of contents, links to the bibliography, and links to previous definitions or theorems. This is a huge improvement! Thanks again for your work! |
I played around a bit with alternative libraries for exporting PDF (see #3177)
You can try out two options (one via PoDoFo (you'll need the latest dev version of this library) and one via libmupdf)
They can be enabled via the cmake flags ENABLE_POFODO and ENABLE_MUPDF.
If enabled, you can export via PoDoFo or mupdf from the command line (only there for now) with the options --export-podofo and --export-mupdf (together with foo.xopp -p bar.pdf).
I'm happy to get any feedback on those attempts (especially if someone encounters a pdf/xopp file where the output is not as expected). The TOC, hyperlinks, Pdf-annotations and other data in the PDF should be preserved.
Forwarded message from #3177:
This is exactly what I did. The xopp-annotations are exported to PDF by cairo (without the pdf background) and added to the background pdf as an XObject (some pdf feature). I had made a mistake in the case the background PDF page already had such XObjects.
Now, the PDF file is opened, the annotations are added, and then the pdf is saved again. Under the hood, the library may decompress bit of data, do something on it, and resave it with a different compression level (see e.g. this striking podofo/podofo#163). This may lead to size increases.
Another possible source of (ever so slight) size increase is with the fact that some internal structure is changed on purpose by my code: specifically, the content of a page can be stored in two ways. Either a single compressed stream or an array of streams. In order to add the overlayed xopp-annotations, I chose to always go with an array of streams, even if the original page had a single stream. This way, I can add by hand the stream necessary for displaying the xopp-annotation overlay without ever touching (decompressing) the original stream.
In a similar manner, I had to add some indirections between the page structures and some of their data to minimize size increase in the case where the user duplicated a PDF page (e.g. the xopp shows twice the first page of the PDF background).
All this may add a handful of bytes per page (maybe up to a kB/page, I don't really know).
One last thing is that pages are modified even is they have no annotations on them. This (and more) could/should be optimized out.