Merging PDFs from download links and creating a TOC



  • I have the following JSON, which I send to jsreport:

    {
      "dataSheets": [
        {
          "brandName": "Brand A",
          "productName": "product A-1",
          "url": "https://download-link-to-A1.pdf"
        },
        {
          "brandName": "Brand A",
          "productName": "product A-2",
          "url": "https://download-link-to-A2.pdf"
        },
        {
          "brandName": "Brand B",
          "productName": "product B-1",
          "url": "https://download-link-to-B1.pdf"
        }
      ]
    }
    

    What would be the best approach to:

    1. Merge all data sheets into one pdf
    2. Create a TOC for all products

    Some of the PDFs may be more than one page long.
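
Because some data sheets span several pages, a TOC needs the page on which each sheet starts. A minimal sketch of that bookkeeping, assuming the page count of every sheet is already known: `pageCount`, `buildToc`, and `groupByBrand` are hypothetical names that are not part of the original JSON or of jsreport, and `firstPage = 2` assumes the TOC itself takes exactly one page.

```javascript
// Build TOC entries with 1-based start pages from per-sheet page counts.
// `pageCount` is a hypothetical field: it is not in the original JSON and
// would have to be obtained by inspecting each downloaded PDF.
function buildToc(dataSheets, firstPage = 1) {
  let nextPage = firstPage;
  return dataSheets.map((sheet) => {
    const entry = {
      brandName: sheet.brandName,
      productName: sheet.productName,
      startPage: nextPage,
    };
    // The next sheet starts right after the last page of this one.
    nextPage += sheet.pageCount;
    return entry;
  });
}

// Group the flat TOC by brand, preserving the order of first appearance.
function groupByBrand(toc) {
  const groups = new Map();
  for (const entry of toc) {
    if (!groups.has(entry.brandName)) groups.set(entry.brandName, []);
    groups.get(entry.brandName).push(entry);
  }
  return [...groups].map(([brandName, products]) => ({ brandName, products }));
}

const sheets = [
  { brandName: 'Brand A', productName: 'product A-1', pageCount: 2 },
  { brandName: 'Brand A', productName: 'product A-2', pageCount: 1 },
  { brandName: 'Brand B', productName: 'product B-1', pageCount: 3 },
];

// firstPage = 2 assumes the TOC occupies exactly one page before the sheets.
const toc = buildToc(sheets, 2);
const tocByBrand = groupByBrand(toc);
```

The grouped result could then be passed to the report template as ordinary data, with one heading per brand and one line per product.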



  • I have tinkered with this and made a little progress, but now I'm stuck. I'm trying to dynamically add an external PDF to my chrome-pdf report, using the node-fetch library for convenience.

    Downloading an external PDF and using it as content works fine.

    const fetch = require('node-fetch');
    
    async function afterRender(req, res, done) {
        const url = 'https://download-path-for-external-pdf.pdf';
    
        try {
            // Download the external PDF and read the whole body into a Buffer.
            const response = await fetch(url);
            const pdfBuffer = await response.buffer();
            // Replace the rendered report with the downloaded PDF.
            res.content = pdfBuffer;
            done();
        } catch (e) {
            console.log('Failed to load PDF..');
            console.log(e);
            done();
        }
    }
    

    Appending the downloaded PDF to the main (chrome-pdf) report does not work.

    Should I do something differently?

    const jsreport = require('jsreport-proxy');
    const fetch = require('node-fetch');
    
    async function afterRender(req, res, done) {
        const url = 'https://download-path-for-external-pdf.pdf';
    
        try {
            // Download the external PDF and read the whole body into a Buffer.
            const response = await fetch(url);
            const pdfBuffer = await response.buffer();
            // Append the downloaded PDF to the rendered report (this call fails).
            const concatenated = await jsreport.pdfUtils.append(res.content, pdfBuffer);
            res.content = concatenated;
            done();
        } catch (e) {
            // Reached for append errors as well as download errors.
            console.log('Failed to load PDF..');
            console.log(e);
            done();
        }
    }
    

    jsreport Studio shows this:
    [screenshot not preserved]

    The console shows this:
    Error: EOL expected but not found

    If I change my main report recipe to html and put the caught error.message in the content, I get this:
    Invalid PDF: startxref not found
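
A message such as "startxref not found" refers to the trailer that a PDF file carries near its end. One cheap diagnostic is to check whether the downloaded bytes look like a complete PDF at all, for example to rule out an HTML error page or a truncated download. The following is a rough sketch only: `inspectPdfBuffer` is a hypothetical helper, not part of jsreport or node-fetch, and a file that passes these checks may still fail to parse.

```javascript
// Rough structural check of a PDF held in a Buffer.
// This is a diagnostic aid, not a validator: it only looks for the markers
// that a parser needs in order to get started.
function inspectPdfBuffer(buffer) {
  // A PDF begins with a "%PDF-" header. Some files carry extra bytes before
  // it, so search the first 1024 bytes as well as checking offset 0.
  const head = buffer.subarray(0, 1024).toString('latin1');
  // The trailer ("startxref", a byte offset, "%%EOF") sits at the end of the
  // file, so only the last 1024 bytes are searched.
  const tail = buffer.subarray(-1024).toString('latin1');
  return {
    hasHeader: head.includes('%PDF-'),
    headerAtStart: head.startsWith('%PDF-'),
    hasStartXref: tail.includes('startxref'),
    hasEof: tail.includes('%%EOF'),
  };
}

// A tiny stand-in for a PDF, and an HTML error page for comparison.
const pdfLike = Buffer.from('%PDF-1.4\n1 0 obj\n<<>>\nendobj\nstartxref\n9\n%%EOF\n', 'latin1');
const htmlPage = Buffer.from('<html><body>Not found</body></html>', 'latin1');

const pdfReport = inspectPdfBuffer(pdfLike);
const htmlReport = inspectPdfBuffer(htmlPage);
```

Logging the result for `pdfBuffer` just before the append call would show whether the problem lies in the download or in the parsing step.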


  • administrators

    hi! hmm yes, i tried this and it seems that it is a bug with some internal library that we use, for some reason the pdf parsing from buffer fails with these PDF files. i've created an issue to solve this, you can subscribe there for progress. thanks for making us aware of this problem.



  • Thanks.
    Are you depending on others to fix something? Can I help?
    Being able to build PDFs from both generated content and external PDFs was the main reason I went with jsreport. This is a show-stopper-problem for me.


  • administrators

    we just need to finish the 2.6.0 release and then we start to plan the next tasks.

    Are you depending on others to fix something?

    we can fix it by ourselves (in this case we don't depend on external libs/maintainers). i think @jan_blaha will take care of this; there are some improvements for pdf-utils that need to be done and this issue will be included there too.

    Can I help?

    yes, basically you will need to debug why this line throws an error when parsing the external PDF, but it can be tricky because it may require some knowledge of the PDF spec.

    i understand this is a problem that blocks you. pdf parsing is a bit tricky, so from time to time there will always be a case in which the parsing fails and we need to fix it.
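
Since an individual file may fail to parse, one possible way to keep a merged report usable is to append the data sheets one at a time and record the ones that fail instead of aborting. This is a sketch under stated assumptions: `appendAll` is a hypothetical helper, and `download` and `append` are injected functions that, in a jsreport script, might wrap node-fetch and `jsreport.pdfUtils.append` as in the posts above.

```javascript
// Append each downloaded sheet to the main PDF, skipping the ones that fail.
// `download(url)` must resolve to a Buffer and `append(a, b)` must resolve to
// the merged Buffer; both are assumptions supplied by the caller.
async function appendAll(mainPdf, dataSheets, download, append) {
  let merged = mainPdf;
  const failed = [];
  for (const sheet of dataSheets) {
    try {
      const pdf = await download(sheet.url);
      merged = await append(merged, pdf);
    } catch (err) {
      // Keep what has been merged so far and remember what was skipped.
      failed.push({ productName: sheet.productName, reason: err.message });
    }
  }
  return { merged, failed };
}
```

The `failed` list could be logged or rendered into the report so that missing data sheets are visible rather than silently dropped.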

