Merging with existing PDF example



  • Hi Jsreport team!

    Today I'm coming to you because I cannot find an example of merging new information into an existing PDF document.
    I have been following this thread (https://github.com/jsreport/jsreport-pdf-utils/issues/2) for a while now and only got a chance to try it out yesterday.
    If I execute a single template that imports the existing PDF through a script, then everything is fine.
    However, each time I attempt to merge with an existing PDF, it ends up with an error :/

    Here is the script I use to import the file. I cannot set the file content on the req object, and I cannot set it to res.content in beforeRender either, so I cannot add any other kind of content to it.

    const http = require('http');
    const fs = require('fs');
    
    function download(url, dest, callback) {
        console.log(url);
        http.get(url, function (response) {
            const file = fs.createWriteStream(dest);
            response.pipe(file);
            // 'close' fires after all data is flushed and the file descriptor is released
            file.on('close', () => callback(fs.realpathSync(dest)));
        }).on('error', (err) => {
            // request errors are emitted on the request object, not passed to the response callback
            console.log(err);
        });
    }
    
    function beforeRender(req, res, done) {
        try {
            req.data.pdfData.pages = req.data.pdfData.pages
                .reverse()
                .map(page => Object.assign(page, {texts: page.texts.reverse()}));
            const tempDir = './temp/pdf';
            fs.existsSync('./temp') || fs.mkdirSync('./temp');
            fs.existsSync(tempDir) || fs.mkdirSync(tempDir);
            const dest = `${tempDir}/${(new Date()).getTime()}_${Math.floor(Math.random() * 100000 + 1)}.pdf`;
            download(req.data.pdf, dest, (realpath) => {
                req.data.dataBuffer = fs.readFileSync(realpath);
                fs.unlinkSync(realpath);
                done();
            });
        } catch (error) {
            console.error(error);
            // report the failure so the request does not wait for the script timeout
            done(error);
        }
    }
    
    function afterRender(req, res, done) {
        res.content = req.data.dataBuffer;
        done();
    }
    

    And this is the error I get when attempting the merge:

    2019-01-16T08:39:29.158Z - info: Starting rendering request 1 (user: null)
    2019-01-16T08:39:29.164Z - info: Rendering template { name: sowesign.concat.report, recipe: chrome-pdf, engine: handlebars, preview: true }
    2019-01-16T08:39:29.170Z - debug: Adding sample data B1Dx4OymX
    2019-01-16T08:39:29.175Z - debug: Resources not defined for this template.
    2019-01-16T08:39:29.184Z - debug: Base url not specified, skipping its injection.
    2019-01-16T08:39:29.187Z - debug: Rendering engine handlebars
    2019-01-16T08:39:29.437Z - debug: Compiled template not found in the cache, compiling
    2019-01-16T08:39:29.441Z - debug: Executing recipe chrome-pdf
    2019-01-16T08:39:29.791Z - debug: Converting with chrome HeadlessChrome/72.0.3617.0 using dedicated-process strategy
    2019-01-16T08:39:30.223Z - debug: Running chrome with params {"printBackground":true,"margin":{}}
    2019-01-16T08:39:30.451Z - info: pdf-utils is starting pdf processing
    Failed to parse pdf. Items, groups and text is not filled: UnknownErrorException: bad XRef entry
    2019-01-16T08:39:30.831Z - debug: Detected 2 pdf operation(s) to process
    2019-01-16T08:39:30.832Z - debug: Running pdf operation prepend
    2019-01-16T08:39:30.834Z - info: Starting rendering request 2 (user: null)
    2019-01-16T08:39:30.836Z - info: Rendering template { name: sowesign.signature.report, recipe: chrome-pdf, engine: handlebars, preview: true }
    2019-01-16T08:39:30.838Z - debug: Inline data specified.
    2019-01-16T08:39:30.842Z - debug: Resources not defined for this template.
    2019-01-16T08:39:30.846Z - debug: Executing script sowesign.signature.script
    2019-01-16T08:39:30.880Z - info: Starting rendering request 3 (user: null)
    2019-01-16T08:39:30.883Z - info: Rendering template { name: sowesign.report, recipe: chrome-pdf, engine: handlebars, preview: true }
    2019-01-16T08:39:30.886Z - debug: Inline data specified.
    2019-01-16T08:39:30.887Z - debug: Resources not defined for this template.
    2019-01-16T08:39:30.889Z - debug: Executing script sowesign.script
    2019-01-16T08:39:31.166Z - debug: http://localhost/Jsreport/cdaf_c.pdf
    2019-01-16T08:39:31.173Z - debug: Base url not specified, skipping its injection.
    2019-01-16T08:39:31.175Z - debug: Rendering engine handlebars
    2019-01-16T08:39:32.230Z - warn: Error when processing render request connect ECONNREFUSED 127.0.0.1:51974 Error: connect ECONNREFUSED 127.0.0.1:51974
        at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1113:14)
    events.js:167
          throw er; // Unhandled 'error' event
          ^
    
    Error [ERR_IPC_CHANNEL_CLOSED]: Channel closed
        at ChildProcess.target.send (internal/child_process.js:628:16)
        at Worker.send (internal/cluster/worker.js:40:28)
        at process.<anonymous> (D:\DEV\www\Groupe_8\jsreport\node_modules\script-manager\lib\worker-servers.js:122:16)
        at process.emit (events.js:182:13)
        at process.EventEmitter.emit (domain.js:442:20)
        at emit (internal/child_process.js:812:12)
        at process._tickCallback (internal/process/next_tick.js:63:19)
    Emitted 'error' event at:
        at ChildProcess.Worker.process.on (internal/cluster/worker.js:25:12)
        at ChildProcess.emit (events.js:182:13)
        at ChildProcess.EventEmitter.emit (domain.js:442:20)
        at process.nextTick (internal/child_process.js:632:35)
        at process._tickCallback (internal/process/next_tick.js:61:11)
    2019-01-16T08:39:32.254Z - warn: Error when processing render request read ECONNRESET Error: read ECONNRESET
        at TCP.onStreamRead (internal/stream_base_commons.js:111:27)
    2019-01-16T08:39:32.283Z - warn: Error when processing render request read ECONNRESET Error: read ECONNRESET
        at TCP.onStreamRead (internal/stream_base_commons.js:111:27)
    2019-01-16T08:39:32.288Z - warn: Error during processing request at http://localhost:5490/api/report/sowesign.concat.report
    

    Can you see what I'm doing wrong here?

    Thanks in advance!



  • Allow me to add some environment details (I forgot those in the previous post and did some testing since). This bug seems to appear only on my local Windows 10 machine, with Node.js v10.11.0 and the latest install. I just went back to the Docker version that we host (still jsreport v2.3.0), and the bug listed above seems to have disappeared. However, I still cannot manage to output the expected merged document, with some printed text and an image on top of the original PDF.



  • So this particular script is assigned to a "dummy" template, and it should download the remote PDF and replace the template output with it?
    In this case, you should do everything in afterRender and set res.content to the buffer.
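
    A minimal sketch of that suggestion, for illustration only. It assumes the PDF URL arrives as req.data.pdf (as in the script above) and that the script is allowed to require the http module; fetchBuffer is a hypothetical helper name, not part of jsreport.

```javascript
const http = require('http');

// Hypothetical helper: collect the body of a GET response into one Buffer.
function fetchBuffer(url, callback) {
    http.get(url, (response) => {
        const chunks = [];
        response.on('data', (chunk) => chunks.push(chunk));
        response.on('end', () => callback(null, Buffer.concat(chunks)));
        response.on('error', callback);
    }).on('error', callback); // connection errors are emitted on the request
}

// Script hook: replace whatever the dummy template rendered with the remote PDF.
function afterRender(req, res, done) {
    fetchBuffer(req.data.pdf, (err, buffer) => {
        if (err) {
            return done(err);
        }
        res.content = buffer;
        done();
    });
}
```

    Keeping the download in memory avoids the temp-file handling and the fs dependency, which also matters on the playground where fs is blocked.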

    You are now doing some magic with req.data.pdfData.pages. I'm not sure what it is.
    Based on the error, this dummy template is about to be prepended to the main one.
    The script is evaluated.
    However, it crashes in handlebars.
    How is your template additionally configured?

    It would save time if you prepared a simple demo for us in the playground.



  • I will prepare the demo and provide it to you in a bit.
    As for req.data.pdfData.pages: when we parse the original PDF (from the server), the output data comes back reversed (from last page to first page), so the script puts it back in order.
    I'm cleaning the whole thing up and will follow your advice.
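
    To illustrate the reordering step, here is a small sketch with made-up data. Only the pages and texts field names come from the script above; the sample values and the restoreOrder name are hypothetical.

```javascript
// Hypothetical parsed output: pages, and the texts inside each page, arrive last-to-first.
const pdfData = {
    pages: [
        { texts: ['page 2, line 2', 'page 2, line 1'] },
        { texts: ['page 1, line 2', 'page 1, line 1'] }
    ]
};

// Restore reading order on copies, leaving the input untouched.
function restoreOrder(pages) {
    return pages
        .slice()
        .reverse()
        .map((page) => Object.assign({}, page, { texts: page.texts.slice().reverse() }));
}

const ordered = restoreOrder(pdfData.pages);
console.log(ordered[0].texts[0]); // first text of the first page
```

    Unlike the original script, this version copies the arrays before reversing them, so running it twice on the same request data does not flip the order back.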



  • Hi @jan_blaha, here's the requested playground. However, fs is blocked on the playground (which is totally normal), so you might have to try it out in a local environment.
    https://playground.jsreport.net/w/anon/Ze4gsN1J

    Thanks again for your help



  • I reworked the script so it doesn't need fs anymore and removed a missing helper that I had used for debugging.

    You will find 2 templates:

    • sowesign.report, which is the dummy and only executes the script
    • sowesign-signature.report, which calls the pdf-utils extension with the first template, the one it is supposed to be merged with

    If you execute the first template, everything behaves as it should.
    If you execute the second template, you get an error:

    Error: Timeout error during executing script
        at Timeout._onTimeout (/app/node_modules/script-manager/lib/manager-servers.js:152:23)
        at ontimeout (timers.js:498:11)
        at Timer.unrefdHandle (timers.js:611:5)
    

    hope this can help you
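
    For readers without access to the playground, the second template's setup can be sketched roughly as below. This is an assumption-laden illustration: the pdfOperations, type and templateShortid field names are as I recall them from the pdf-utils extension for jsreport v2 and should be checked against its README, and the shortid value is a placeholder, not the real one.

```javascript
// Sketch of the second template's definition (not copied from the playground).
const signatureTemplate = {
    name: 'sowesign-signature.report',
    recipe: 'chrome-pdf',
    engine: 'handlebars',
    // Assumed pdf-utils configuration: merge the output of another template into this one.
    pdfOperations: [
        {
            type: 'merge', // the logs also show a 'prepend' operation in an earlier attempt
            templateShortid: 'PLACEHOLDER_SHORTID' // would be the shortid of sowesign.report
        }
    ]
};

console.log(JSON.stringify(signatureTemplate, null, 2));
```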



  • Hi @jan_blaha ,

    So I managed to make it work somehow (??? no idea why, but now I can get an almost proper merge).
    The problem I see now is that special characters (such as éèà' ...) from the original PDF seem to disappear upon merging, which is a bit annoying. Does pdf-utils change anything regarding the charset? (I believe it does not, just asking.)

    In addition:
    I'm trying to work on the suggestion from @bjrmatos in https://github.com/jsreport/jsreport/issues/470, and while I got the same result upon merging, I do manage to successfully upload a PDF into the jsreport workflow without having to go through a script. However, in this version, the client needs to add the PDF URL to the data they send as $pdfInput (borrowing from the $pdf node that we can use in jsreport). If that could be of interest for jsreport later, let me know.

    Thanks!



  • Now the problem I've seen is that any special characters (such as éèà' ...) from the original PDF seem to disappear upon merging

    Hm, I can't replicate it. See here:
    https://playground.jsreport.net/w/anon/ibbWubwr
    Are you able to change that demo so that the characters disappear?



  • Hi Jan!

    I cannot reproduce it with the given example, as it happens only with the imported PDF,
    but you can look here, where they do disappear (execute sowesign.report by itself first so you can compare):

    https://playground.jsreport.net/w/anon/Ze4gsN1J

    Also, I just tried with another PDF from another source, and this time the characters that were missing in the first PDF were all present. I would assume it comes down to how the original PDF was generated (by Word, if I got it right). I did, however, try with 2 other PDFs, both generated with Word. The first one failed because the max call stack was reached:

    2019-01-17T16:06:40.622Z - info: Starting rendering request 1 (user: null)
    2019-01-17T16:06:40.625Z - info: Rendering template { name: sowesign.signature.report, recipe: chrome-pdf, engine: handlebars, preview: true }
    2019-01-17T16:06:40.627Z - debug: Inline data specified.
    2019-01-17T16:06:40.628Z - debug: Resources not defined for this template.
    2019-01-17T16:06:40.631Z - debug: Executing script sowesign.signature.script
    2019-01-17T16:06:40.801Z - debug: Base url not specified, skipping its injection.
    2019-01-17T16:06:40.818Z - debug: Rendering engine handlebars
    2019-01-17T16:06:41.000Z - debug: Compiled template not found in the cache, compiling
    2019-01-17T16:06:41.002Z - debug: Executing recipe chrome-pdf
    2019-01-17T16:06:41.249Z - debug: Converting with chrome HeadlessChrome/72.0.3617.0 using dedicated-process strategy
    2019-01-17T16:06:41.516Z - debug: Chrome will wait for network iddle before printing
    2019-01-17T16:06:46.237Z - debug: Running chrome with params {"printBackground":false,"marginTop":"","height":"297.3mm","width":"210mm","waitForNetworkIddle":true,"margin":{"top":""}}
    2019-01-17T16:06:46.550Z - info: pdf-utils is starting pdf processing
    2019-01-17T16:06:46.947Z - debug: Detected 1 pdf operation(s) to process
    2019-01-17T16:06:46.948Z - debug: Running pdf operation merge
    2019-01-17T16:06:46.949Z - info: Starting rendering request 2 (user: null)
    2019-01-17T16:06:46.951Z - info: Rendering template { name: sowesign.report, recipe: import-pdf, engine: none, preview: true }
    2019-01-17T16:06:46.952Z - debug: Inline data specified.
    2019-01-17T16:06:46.953Z - debug: Resources not defined for this template.
    2019-01-17T16:06:46.955Z - debug: Base url not specified, skipping its injection.
    2019-01-17T16:06:46.960Z - debug: Rendering engine none
    2019-01-17T16:06:46.983Z - debug: Compiled template not found in the cache, compiling
    2019-01-17T16:06:46.984Z - debug: Executing recipe import-pdf
    2019-01-17T16:06:47.002Z - debug: Skipping storing report.
    2019-01-17T16:06:47.021Z - info: Rendering request 2 finished in 72 ms
    2019-01-17T16:06:47.364Z - warn: Error when processing render request Maximum call stack size exceeded RangeError: Maximum call stack size exceeded
        at String.replace (<anonymous>)
        at new PDFName (D:\DEV\www\Groupe_8\jsreport\node_modules\jsreport-pdfjs\lib\object\name.js:34:17)
        at Function.parse (D:\DEV\www\Groupe_8\jsreport\node_modules\jsreport-pdfjs\lib\object\name.js:94:12)
        at Function.parse (D:\DEV\www\Groupe_8\jsreport\node_modules\jsreport-pdfjs\lib\object\dictionary.js:71:27)
        at Object.exports.parse (D:\DEV\www\Groupe_8\jsreport\node_modules\jsreport-pdfjs\lib\object\value.js:20:30)
        at Function.parse (D:\DEV\www\Groupe_8\jsreport\node_modules\jsreport-pdfjs\lib\object\object.js:68:28)
        at parseObject (D:\DEV\www\Groupe_8\jsreport\node_modules\jsreport-pdfjs\lib\object\reference.js:67:20)
        at PDFReference.get [as object] (D:\DEV\www\Groupe_8\jsreport\node_modules\jsreport-pdfjs\lib\object\reference.js:13:17)
        at Function.addObjectsRecursive (D:\DEV\www\Groupe_8\jsreport\node_modules\jsreport-pdfjs\lib\parser\parser.js:61:35)
        at D:\DEV\www\Groupe_8\jsreport\node_modules\jsreport-pdfjs\lib\parser\parser.js:78:18
    2019-01-17T16:06:47.372Z - warn: Error during processing request at http://localhost:5490/api/report/sowesign.signature.report
    

    This one was quite large and contained a lot of images, so I will look deeper into that.

    The second one was shorter and contained a lot of special characters, but some disappeared again or were replaced (é => i, for example). I will try with some more PDFs from different sources. Maybe it is more about how Word encodes the PDF in the first place, which causes a conflict during the merge operation?



  • I think I was able to resolve the issue with encoding. I will post here when it is released.

    For the second error (max call stack reached), I need the PDF that causes it.
    Unfortunately, the PDF spec is too complex, so we need to resolve this case by case.



  • Hi @jan_blaha , thanks for your help!!

    I understand the problem with the PDF spec; many sources won't implement it the same way. We have to deal with PDFs coming from various clients in another product, and it is true that we hit some conflicts when attempting to parse them.

    Regarding the PDF, I will send it to you by email. I made the document shorter during the test but still encountered the error. If you need the full version, let me know.

    Thank you again!



  • It is more a razor/asp.net question than jsreport question.
    I am not using razor so not sure.

    I would guess you parse the json using json.net into a dynamic object and pass it as dynamic model to the view.
    However, as I said, not an expert here.



  • Hi Jan, sorry for the delay,

    I'm not sure I understand clearly. Is it about the last PDF I sent you, regarding the reason why the max call stack is reached?
    I do not use razor or asp.net personally, so that is why I'm quite confused.



  • Oops, sorry. I posted the answer to the wrong topic.
    Please ignore it.



  • Hi @jan_blaha ,
    I read somewhere that you were talking about releasing jsreport 2.4.0 quite soon. Will this update include the patch you mentioned before, or do you still need to look deeper into it?

    Thanks for your help!!



  • Hi, I am not sure when we will do the full release. We got a bit overwhelmed with other stuff now.
    However, you can get the patch I mentioned earlier by installing from git master.

    npm i jsreport/jsreport-pdf-utils
    


  • Hi,
    Thanks for the feedback, I'll look at it right away.

    Thanks again!!



  • Works like a charm, thanks a lot!!

