Showing content from https://github.com/qpdf/qpdf-dev/discussions/3 below:

Pages · qpdf/qpdf-dev · Discussion #3 · GitHub

The "Pages" epic has been promised for some time. I think it is important to make some progress, preferably for 12.0. The approach I am proposing for the short term is to concentrate on aspects that would be of benefit to pdfarranger. This provides some focus while at the same time providing an opportunity to test drive any enhancements to see how they are performing from a user perspective.

Taking this approach, the initial targets would be

with the next target of

make page assembly and transformations using job JSON more practical

There are two aspects to this

For both items the next step is to refactor QPDFJob::handlePageSpecs. This should also provide an opportunity to substantially reduce qpdf's memory footprint when combining pages from multiple files, where qpdf can be very memory hungry. For example, extracting a single page from the pdf-spec test file uses over 22.5MB, and usage grows pretty much linearly with the number of input files. By the time we reach the default keep-files-open threshold we are looking at a 4.5GB footprint.
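As a rough sanity check on those figures, here is a back-of-the-envelope sketch of the linear model. It assumes the default keep-files-open threshold is 200 files and that every input costs the same 22.5MB as the pdf-spec example; both constants are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope estimate of the footprint described above.
# Both constants are assumptions for illustration only.
PER_FILE_MB = 22.5                 # footprint quoted for one pdf-spec input
KEEP_FILES_OPEN_THRESHOLD = 200    # assumed default threshold


def estimated_footprint_mb(num_inputs, per_file_mb=PER_FILE_MB):
    """Linear model: every input file stays fully loaded in memory."""
    return num_inputs * per_file_mb


if __name__ == "__main__":
    for n in (1, 10, 100, KEEP_FILES_OPEN_THRESHOLD):
        gb = estimated_footprint_mb(n) / 1000
        print(f"{n:>4} inputs -> {gb:.2f} GB")
```

Under these assumptions the model reproduces the 4.5GB figure at the threshold, which is consistent with all inputs being held in memory at once.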

Splitting the work of handlePageSpecs into two stages - copying the page objects into the primary input, followed by inserting the pages into the page tree in the required order - would allow us to have only two QPDF objects in memory at a time: the primary input and one other.
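The two-stage idea can be sketched roughly as follows. This is a toy model in Python, not qpdf code: `Doc`, `Opener` and `assemble` are hypothetical names, pages are plain strings, and "copying" is a list lookup. It only illustrates that at most one foreign document needs to be open alongside the primary input.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for an open PDF; not a qpdf type."""
    name: str
    pages: list                      # page contents, index 0 is page 1
    page_tree: list = field(default_factory=list)


class Opener:
    """Opens foreign documents and records how many are open at once."""

    def __init__(self, files):
        self.files = files           # {file name: list of page contents}
        self.open_now = 0
        self.peak = 0

    def open(self, name):
        self.open_now += 1
        self.peak = max(self.peak, self.open_now)
        return Doc(name, list(self.files[name]))

    def close(self, doc):
        self.open_now -= 1


def assemble(primary, specs, opener):
    """specs: list of (file name, 1-based page number) in output order."""
    # Stage 1: visit each source file once and copy the pages it contributes.
    copied = {}
    for name in dict.fromkeys(n for n, _ in specs):  # unique, in first-use order
        if name == primary.name:
            source, foreign = primary, None
        else:
            source = foreign = opener.open(name)
        for spec_name, page_no in specs:
            if spec_name == name:
                copied[(name, page_no)] = source.pages[page_no - 1]
        if foreign is not None:
            opener.close(foreign)    # released before the next file is opened

    # Stage 2: build the page tree in the requested order from the copies.
    primary.page_tree = [copied[spec] for spec in specs]
    return primary
```

Usage, with made-up file names: `assemble(Doc("a.pdf", ["a1", "a2"]), [("c.pdf", 1), ("a.pdf", 2), ("b.pdf", 2)], Opener({"b.pdf": ["b1", "b2"], "c.pdf": ["c1"]}))` leaves the opener's peak at one foreign document, i.e. two documents in memory including the primary.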

preserve hyperlinks from foreign pdfs

The next main step is to import named destinations from foreign files. The main issue seems to be dealing with name clashes.
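One possible way to deal with clashes is to rename the incoming destination and keep a rename map so that links copied from the same foreign file can be rewritten to match. The sketch below is only an illustration of that idea: named destinations are modelled as a plain dict, and `import_named_dests`, the `tag` argument and the suffix scheme are all hypothetical, not anything qpdf currently does.

```python
def import_named_dests(primary, foreign, tag):
    """Merge foreign named destinations into primary, renaming on clash.

    primary, foreign: dicts mapping destination name -> target (opaque here).
    tag: hypothetical per-file label used to build replacement names.
    Returns {old name: new name} for every destination that was renamed,
    so links imported from the foreign file can be updated accordingly.
    """
    renames = {}
    for name, target in foreign.items():
        new_name = name
        counter = 0
        # Keep generating candidates until one is unused in the primary.
        while new_name in primary:
            counter += 1
            new_name = f"{name}.{tag}.{counter}"
        primary[new_name] = target
        if new_name != name:
            renames[name] = new_name
    return renames
```

For example, importing `{"intro": ..., "toc": ...}` with tag `"f1"` into a primary that already has `"intro"` would add `"intro.f1.1"` and `"toc"`, and return `{"intro": "intro.f1.1"}`.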

preserve outlines from foreign pdfs

This will also rely on the import of named destinations and therefore logically follows preservation of links.

The main difficulty I see is how outlines from multiple pdfs should be merged. One idea is to simply concatenate the top-level entries from the various inputs and provide a facility to dump the outlines to a JSON file for editing and reloading. Trying to come up with an interface that allows users to specify how outlines should be combined feels challenging. My gut feeling is that in most of the more complex cases it will require user interaction and therefore should not be handled by qpdf directly.
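To make the "concatenate, dump, edit, reload" idea a bit more concrete, here is a minimal sketch. The outline representation (a list of dicts with `title`, `dest` and `kids` keys) is a made-up layout for illustration and is not qpdf's JSON format; the function names are hypothetical as well.

```python
import json


def concat_outlines(*outlines):
    """Concatenate the top-level entries of each input, in input order."""
    merged = []
    for top_level in outlines:
        merged.extend(top_level)
    return merged


def outlines_to_json(outline):
    """Serialise the merged outline so a user can edit it by hand."""
    return json.dumps(outline, indent=2)


def outlines_from_json(text):
    """Reload an (optionally edited) outline."""
    return json.loads(text)


if __name__ == "__main__":
    a = [{"title": "Chapter 1", "dest": "a.ch1",
          "kids": [{"title": "1.1", "dest": "a.s11", "kids": []}]}]
    b = [{"title": "Appendix", "dest": "b.app", "kids": []}]
    print(outlines_to_json(concat_outlines(a, b)))
```

Anything beyond plain concatenation (regrouping, renaming, nesting one file's outline under another's) would then happen in the edited JSON rather than through a qpdf option.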

