Would you have any recommendation on the size of instance that we should be using in production?

There is no way for me to predict this. Try some production load on your test server and find the best-fitting VM size.
Be ready to scale up and down in production.

I can definitely also see a lag between the time the Node Express data extract has all the data and when it is delivered to the report preprocessing script as a JSON array. I also imported the axios library into jsreport to replace the request utility, and that seems to work on large datasets without having to specifically code anything to stream data. Would you have any concerns about me doing that?

I'm afraid I don't get what you mean.