GSD

Generating PDFs with Node

Posted by Noah Heinrich on October 2, 2023

Table of contents

What would we do without PDFs? It’s the safest, most secure way of storing visual information, and it’s accessible on every major platform in the world. Dynamically creating a printer-friendly version of a webpage, a user’s invoice or event tickets all require creating secure information that a user can download, email, or print. With such practical functionality, the possibility of using PDFS in the foreseeable future remains high for both end-users and web application developers. 

Technically implementing a PDF generator is surprisingly difficult. There’s no standardized tool for it, and the operation is more complex than one would assume. As much as we’d love to tell you the definitive method for generating PDFs, there simply isn’t one.

If you are lacking time and patience to do it yourself, there are paid options out there. These include:

banner-cta-node-blue.webp

All hassle-free, PDF Crowd is the only one without a free option and, PDF Generator doesn’t have any paid options between free and $59/month.

On the other hand, if you’d rather implement a solution yourself, the best option is to create a headless browser using Node JS. A headless browser is simply a GUI-less browser that can run in the background of a webpage and continue to implement solutions that a browser normally does.  There are several headless Node browsers that generate PDFs.

generating PDFs

PhantomJS

One headless browser that’s been around for some time is PhantomJS. It’s an open-source JavaScript API that’s been in use since 2011. Unfortunately, as of 2018, it is no longer in active development, so it may not continue to be a viable option for much longer. However, it’s syntax is simple to learn even for beginners, so for now it remains a staple among headless browsers.

The easiest way to create a PDF using PhantomJS is the render method. This function allows the headless browser to create a file out of any webpage; not just PDFs, but JPEGs, PNGs, and more. The method for doing this is incredibly simple.

var webPage = require('webpage');
var page = webPage.create();

page.viewportSize = { width: 1920, height: 1080 };
page.open("http://www.google.com", function start(status) {
  page.render('google_home.pdf, {format: 'pdf', quality: '100'});
  phantom.exit();
});


In this example, PhantomJS asynchronously opens the Google homepage in the background of your normal browser, and then renders the entire page as a PDF when it’s finished. If you need to give your users a printer-friendly PDF of your page, you could copy and paste this code with minimal alteration. If you need to render something else, like an invoice or a receipt, you could use a string template, and load it as a web page with PhantomJS in the same manner.

const puppeteer = require("puppeteer");
(async () {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setViewport({ width: 1440, height: 900, deviceScaleFactor: 2 });
await page.goto("file:///practice/resumeapp/resume.html", {
waitUntil: "networkidle2"
});
await page.pdf({
path: "resume.pdf",
pageRanges: "1",
format: "A4",
printBackground: true
});

await browser.close();
})(); 

Puppeteer

Perhaps the most common headless browser for Node is Puppeteer, an API that is run by Google Chrome. Thanks to that Google backing, Puppeteer is extremely powerful, and theoretically can do anything that Chrome can do. The advantages Puppeteer has over Phantom is that it’s better documented, supported, and more flexible. It is a bit more difficult for a newcomer to create PDFs, but there is plenty of information on how to use it on it’s GitHub page.

banner-cta-node-blue.webp

Puppeteer’s method for creating a PDF out of a webpage is much the same as Phantom’s. It makes a series of asynchronous calls in the background of your browser. Here is an example of the same code as above, done with Puppeteer.

const puppeteer = require('puppeteer');

(async () => {

  const browser = await puppeteer.launch();

  const page = await browser.newPage();

  await page.goto('https://google.com', {waitUntil: 'networkidle2'});

  await page.pdf({path: 'hn.pdf', format: 'A4'});

  await browser.close();

})();


As you can see, it’s nearly the same as the PhantomJS code, but a bit more complicated. However, the advantage of Puppeteer is its versatility. The Puppeteer API has a wide range of options involving PDF generation that you can find
here. The above code produces this result, vs the original page:

undefined

Html-pdf-chrome

A third option for generating PDFs from HTML in Node is by using a library called html-pdf-chrome. This isn’t a separate headless browser, but a library that utilizes Chrome’s built-in headless browsing capabilities for the sole purpose of generating PDFs. Of the methods for conversion, this is the simplest to use that we’ve seen. The only limitation is that you must have a recent version of Chrome or Chromium installed on your machine in order to use it. Like Puppeteer and Phantom, you can convert an external page to a PDF, or load a page from a template to make something more customized. Here is an example of how to use html-pdf-chrome to convert an external page into a PDF.

import * as htmlPdf from 'html-pdf-chrome';
const options: htmlPdf.CreateOptions = {
  port: 9222, // po
};
const url = 'https://github.com/westy92/html-pdf-chrome';
const pdf = await htmlPdf.create(url, options);


As you can see, it’s very simple to use. Unfortunately, it is not as well-documented as Puppeteer and doesn’t seem to have as many options as Puppeteer or Phantom. Html-pdf-chrome is best for simpler jobs, perhaps, while Puppeteer may be a better option for more complex tasks like creating invoices.

These are just a few of the possible solutions available right now. As long as people need PDFs from the web, the field will keep evolving, and new tools will be created. Perhaps one will become the industry standard. In the meantime, one of these options can provide you with a good place to start. If you have found other options for generating PDFs with Node, please let us know! We’re always looking for new solutions.

Sign up to receive tutorials on generating PDFs with Node.
    
Noah Heinrich

Noa Heinrich is an instructor at the Flatiron School in Chicago, and a produce of the weekly gaming podcast Tabletop Potluck.

ButterCMS is the #1 rated Headless CMS

G2 crowd review award G2 crowd review award G2 crowd review award G2 crowd review award G2 crowd review award G2 crowd review award G2 crowd review award G2 crowd review award G2 crowd review award G2 crowd review award G2 crowd review award G2 crowd review award

Don’t miss a single post

Get our latest articles, stay updated!