Cannot append PDFs created by pdftk to report
-
Hello,
I am trying to append a merged PDF document created by pdftk into a new report but am receiving an xref error:
error: Report render failed (because) invalid xref: xref expected but not found
Error: Invalid xref: xref expected but not found
    at PDFXref.parseXrefObject (node_modules\@jsreport\pdfjs\lib\object\xref.js:107:13)
    at PDFXref.parse (node_modules\@jsreport\pdfjs\lib\object\xref.js:64:19)
    at Parser.parse (node_modules\@jsreport\pdfjs\lib\parser\parser.js:39:26)
    at module.exports.parseBuffer (node_modules\@jsreport\pdfjs\lib\parser\parser.js:99:10)
    at new ExternalDocument (node_modules\@jsreport\pdfjs\lib\external.js:7:20)
    at Object.append (node_modules\@jsreport\jsreport-pdf-utils\lib\pdfManipulator.js:29:23)
    at Object.append (node_modules\@jsreport\jsreport-pdf-utils\lib\proxyExtend.js:25:25)
    at Request._callback (sandbox.js:15:47)
    at self.callback (node_modules\request\request.js:185:22)
    at Request.emit (node:events:518:28)
rootId=2emrsywhkmxopvw, id=2emrsywhkmxopvw
2024-04-30T22:04:29.879Z - error: Error during processing request at http://localhost:3000/reporting/api/report, details: Invalid xref: xref expected but not found, stack: Error: Invalid xref: xref expected but not found
    at PDFXref.parseXrefObject (node_modules\@jsreport\pdfjs\lib\object\xref.js:107:13)
    at PDFXref.parse (node_modules\@jsreport\pdfjs\lib\object\xref.js:64:19)
    at Parser.parse (node_modules\@jsreport\pdfjs\lib\parser\parser.js:39:26)
    at module.exports.parseBuffer (node_modules\@jsreport\pdfjs\lib\parser\parser.js:99:10)
    at new ExternalDocument (node_modules\@jsreport\pdfjs\lib\external.js:7:20)
    at Object.append (node_modules\@jsreport\jsreport-pdf-utils\lib\pdfManipulator.js:29:23)
    at Object.append (node_modules\@jsreport\jsreport-pdf-utils\lib\proxyExtend.js:25:25)
    at Request._callback (sandbox.js:15:47)
    at self.callback (node_modules\request\request.js:185:22)
    at Request.emit (node:events:518:28)
The pdf I am trying to append is a merge of multiple pdfs that are also merged pdfs. Let me illustrate what I mean. The below shows the folder structure of my report staging area:
root
|
---- subfolder_1
|    |
|    ---- merged_1.pdf <-- created by pdftk by merging report_1.pdf and external.pdf
|    |
|    ---- report_1.pdf <-- gets created by jsreport
|    |
|    ---- external.pdf <-- uploaded by user, gets appended to report_1.pdf
|
---- subfolder_2
|    |
|    ---- merged_2.pdf <-- created by pdftk by merging report_2.pdf, in this case, with nothing
|    |
|    ---- report_2.pdf <-- gets created by jsreport
|
---- merged.pdf <-- created by pdftk by merging merged_1.pdf and merged_2.pdf
The issue I'm running into is that the merged.pdf file cannot be appended properly. The report I am trying to append to does a simple fetch for the PDF and then merges it via the pdfUtils.append function:
    const request = require('request');
    const jsreport = require('jsreport-proxy');

    async function afterRender (req, res, done) {
      // fetch pdf content
      request.get({
        url: /*endpoint to fetch merged.pdf*/,
      }, async (err, response, body) => {
        let buf = Buffer.from(body);
        res.content = await jsreport.pdfUtils.append(res.content, buf);
        done();
      });
    }
The endpoint it hits is defined as:
    let getMergedReport = (req, res, next) => {
      let pdf = fs.readFileSync(`path\\to\\merged.pdf`);
      res.send(pdf);
    }
The reason for doing multiple merges is that I need to be able to append attachments to each subreport before merging them all together.
The report I'm trying to generate is a simple cover page that should show the page count of the entire report after everything has been merged.
Is this due to a general incompatibility with pdftk or something else?
Any help or suggestions are appreciated, thanks!
-
The PDF spec is very broad, and we primarily focus on compatibility with PDFs produced by Chrome. Please share your external PDF somewhere so we can take a look at why it is failing and whether there is a quick fix we can provide.
-
Sure, I've uploaded a sample of the output from pdftk here.
I should also note, after a bit of testing, it seems that even PDFs created by jsreport itself have this issue as well. If I create a report via jsreport and then try to fetch and append that PDF to my report, I also get the same xref error. Is this behavior expected? The reports I am creating through jsreport are using the chrome-pdf recipe with handlebars engine.
A sample of the jsreport output is here.
-
I can append the shared pdf. Maybe you have a problem with your fetching code.
https://playground.jsreport.net/w/anon/boQEZ3Ir
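One way a fetch like the one above can go wrong, offered as an assumption rather than a confirmed diagnosis: with the request module, the body passed to the callback is a string decoded as UTF-8 unless `encoding: null` is set, and `Buffer.from(body)` then re-encodes that string. A PDF contains raw bytes that are not valid UTF-8, so the round trip changes them and shifts the byte offsets that the xref table relies on. A minimal sketch of the effect, using made-up sample bytes rather than the shared file:

```javascript
// Made-up sample: "%PDF-1.4\n%" followed by four high-bit bytes, similar to
// the binary marker comment found near the top of many PDF files.
const original = Buffer.from([
  0x25, 0x50, 0x44, 0x46, 0x2d, 0x31, 0x2e, 0x34, 0x0a,
  0x25, 0xe2, 0xe3, 0xcf, 0xd3, 0x0a,
]);

// Decoding as UTF-8 replaces each invalid byte with U+FFFD ...
const asText = original.toString('utf8');

// ... and re-encoding writes U+FFFD back as several bytes, so the data grows.
const roundTripped = Buffer.from(asText, 'utf8');

console.log('same bytes:', original.equals(roundTripped));
console.log('lengths:', original.length, roundTripped.length);
```

If this is what is happening, passing `encoding: null` in the request options should make `body` a Buffer that can be handed to pdfUtils.append without the extra `Buffer.from` call.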
-
Thanks for your help, it seems it was indeed the way I was fetching the data. I opted to use the 'http' module instead of 'request' and buffer everything into a Uint8Array first before trying to append, and that seems to have worked.
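For anyone landing here later, the approach described above can be sketched roughly as follows. `fetchPdfBuffer` is a hypothetical helper name, not part of jsreport, and the sketch assumes a plain http endpoint that returns the PDF with status 200. No encoding is set on the response, so each chunk stays a Buffer and the bytes are never decoded as text.

```javascript
const http = require('http');

// Hypothetical helper: downloads a URL and resolves with the raw bytes.
function fetchPdfBuffer(url) {
  return new Promise((resolve, reject) => {
    http.get(url, (response) => {
      if (response.statusCode !== 200) {
        response.resume(); // discard the body so the socket is released
        reject(new Error(`Unexpected status code ${response.statusCode}`));
        return;
      }
      const chunks = [];
      // Without setEncoding(), every chunk is delivered as a Buffer.
      response.on('data', (chunk) => chunks.push(chunk));
      response.on('end', () => resolve(Buffer.concat(chunks)));
      response.on('error', reject);
    }).on('error', reject);
  });
}
```

Inside afterRender, the result could then be passed along the lines of `res.content = await jsreport.pdfUtils.append(res.content, await fetchPdfBuffer(url))`, where `url` stands for the endpoint that serves merged.pdf.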
That leads me to my next question:
Is there any way to set the $pdf report data in the afterRender function? My assumption is no, but I'm hoping there is a less cumbersome way than having to do:
    async function afterRender(req, res, done) {
      --snip--
      // append merged doc
      res.content = await jsreport.pdfUtils.append(res.content, Buffer.from(buf));
      // get $pdf object
      let $pdf = await jsreport.pdfUtils.parse(res.content);
      // re-render this template with the $pdf object in data
      let actual = await jsreport.render({
        template: {
          shortid: 'id', // same template that is currently being rendered
        },
        data: {
          ...req.data,
          $pdf: $pdf,
        },
      });
      // append merged doc again
      res.content = await jsreport.pdfUtils.append(actual.content, Buffer.from(buf));
      done();
    }
-
That looks right. You need to re-render.
-
Sorry, it seems my issue has not been resolved entirely. Appending a merged PDF still fails, except now with EOL errors instead of xref errors. I assume this error didn't show up earlier because I was only merging a single PDF with pdftk, which seems to have been just a simple copy. Now that I'm merging multiple PDFs, pdftk produces different output, and that output triggers the error.
Looking at the location it's erroring in, it seems to expect there to be a line ending directly after endstream in the PDF, but my output PDF has a space first, then the LF.
Not sure if there is any way to reconcile this easily.
A sample of this file can be found here.
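To check whether a given file contains the pattern described above before handing it to pdfUtils.append, a small byte scan can help. This is a diagnostic sketch only: `findEndstreamWithTrailingSpace` is a hypothetical helper, not part of jsreport or pdftk, and it does not repair the file. It could in principle report a false positive if the same bytes happen to occur inside binary stream data.

```javascript
// Hypothetical helper: returns the byte offsets of every "endstream" keyword
// that is followed by a space and then an end-of-line byte (LF or CR).
function findEndstreamWithTrailingSpace(pdfBuffer) {
  const keyword = Buffer.from('endstream ', 'latin1');
  const offsets = [];
  let pos = pdfBuffer.indexOf(keyword);
  while (pos !== -1) {
    const next = pdfBuffer[pos + keyword.length]; // undefined past the end
    if (next === 0x0a || next === 0x0d) {
      offsets.push(pos);
    }
    pos = pdfBuffer.indexOf(keyword, pos + 1);
  }
  return offsets;
}
```

Calling it on the Buffer read from merged.pdf and checking whether the returned array is empty should show whether the file has the space-then-EOL form.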
-
Thank you for sharing the pdf.
I've fixed the problem, so it should be supported in the next version of jsreport.
We will most likely release it this week.