The Ultimate Guide to Generating PDFs from HTML with Node.js and Puppeteer

Posted on: August 20, 2025

Ever needed to generate a PDF invoice, a report, or an e-ticket from your web application? It’s a common requirement, but turning dynamic HTML into a pixel-perfect PDF can be surprisingly tricky.

In this guide, we’ll walk you through the entire process of building a robust PDF generation script using Node.js and Puppeteer, a powerful headless browser tool. We’ll cover the basics and also touch on the real-world challenges you might face in a production environment.

Prerequisites

Before we begin, make sure you have Node.js (version 18 or higher) and npm installed on your machine.


Step 1: Setting Up Your Project

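First, let’s create a new project directory and initialize it with a package.json file. The scripts in this guide use ES module import syntax, so we also set "type": "module" in package.json.

mkdir pdf-generator
cd pdf-generator
npm init -y
npm pkg set type=module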

Next, we need to install Puppeteer. Installing the package also downloads a compatible browser build (Chrome for Testing in recent Puppeteer releases) that we can control programmatically.

npm install puppeteer

Step 2: The Core PDF Generation Logic

Now for the fun part. Create a file named generate.js and add the following code. This script defines a function that takes a string of HTML, launches a headless browser, and saves the rendered content as a PDF.

// generate.js
import puppeteer from 'puppeteer';
import fs from 'fs';

// Example HTML content for an invoice
const invoiceHtml = `
<html>
  <head>
    <style>
      body { font-family: Arial, sans-serif; margin: 40px; }
      .invoice-header { text-align: center; margin-bottom: 40px; }
      .invoice-header h1 { margin: 0; }
      .item-table { width: 100%; border-collapse: collapse; }
      .item-table th, .item-table td { border: 1px solid #ddd; padding: 8px; }
      .item-table th { background-color: #f2f2f2; }
      .total { text-align: right; margin-top: 20px; font-weight: bold; }
    </style>
  </head>
  <body>
    <div class="invoice-header">
      <h1>Invoice #123</h1>
      <p>Issued: August 20, 2025</p>
    </div>
    <table class="item-table">
      <thead>
        <tr><th>Item</th><th>Quantity</th><th>Price</th></tr>
      </thead>
      <tbody>
        <tr><td>Web Development Services</td><td>10</td><td>$150.00</td></tr>
        <tr><td>API Consulting</td><td>5</td><td>$200.00</td></tr>
      </tbody>
    </table>
    <div class="total">
      Total: $2500.00
    </div>
  </body>
</html>
`;

async function generatePdfFromHtml(htmlContent) {
  let browser;
  try {
    console.log('Launching browser...');
    browser = await puppeteer.launch();

    console.log('Opening new page...');
    const page = await browser.newPage();

    console.log('Setting page content...');
    // Wait for network idle to ensure all assets like fonts or images are loaded
    await page.setContent(htmlContent, { waitUntil: 'networkidle0' });
    await page.emulateMediaType('print');

    console.log('Generating PDF...');
    const pdfBuffer = await page.pdf({
      format: 'A4',
      printBackground: true
    });

    console.log('Saving PDF...');
    fs.writeFileSync('invoice.pdf', pdfBuffer);
    console.log('PDF generated successfully: invoice.pdf');
  } catch (error) {
    console.error('Error generating PDF:', error);
  } finally {
    if (browser) {
      console.log('Closing browser...');
      await browser.close();
    }
  }
}

generatePdfFromHtml(invoiceHtml);

You can run the script from your terminal:

node generate.js

After a few moments, you’ll have a beautifully rendered invoice.pdf in your project folder!
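If you need margins, a page-numbered footer, or landscape output, page.pdf() accepts more options than the two we used. Here is a minimal sketch of how those options could be assembled. The helper name buildPdfOptions is hypothetical (it is not part of Puppeteer), and the margin and font sizes are placeholder values to adjust for your layout.

```javascript
// Sketch: build an options object for page.pdf() with margins and a
// page-numbered footer. buildPdfOptions is a hypothetical helper name.
function buildPdfOptions({ title = '', landscape = false } = {}) {
  // Escape the title because it is interpolated into an HTML template.
  const safeTitle = String(title)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');

  const barStyle = 'font-size:9px; width:100%; text-align:center;';

  return {
    format: 'A4',
    landscape,
    printBackground: true,
    // Leave room in the margins for the header and footer.
    margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
    displayHeaderFooter: true,
    headerTemplate: `<div style="${barStyle}">${safeTitle}</div>`,
    // Puppeteer injects values into elements with the classes
    // "pageNumber" and "totalPages".
    footerTemplate:
      `<div style="${barStyle}">Page <span class="pageNumber"></span>` +
      ` of <span class="totalPages"></span></div>`,
  };
}

// Usage inside generatePdfFromHtml:
// const pdfBuffer = await page.pdf(buildPdfOptions({ title: 'Invoice #123' }));
```

Keeping the options in one small function makes it easy to share the same layout across invoices, reports, and tickets.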

Step 3: Real-World Challenges

While the script above works great locally, running Puppeteer in a production environment introduces a new set of challenges:

  • Server Dependencies: Your server needs all the correct system libraries to run headless Chromium. This can be complex to manage, especially in Docker containers.

  • Performance: Launching a full browser instance for every PDF generation can be slow and consume a lot of memory and CPU, which can be costly at scale.

  • Maintenance: You’re now responsible for keeping the server, Node.js, and Puppeteer updated to patch security vulnerabilities.
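One common way to soften the performance cost is to launch the browser once and open a fresh page per PDF, instead of launching a browser per request. The sketch below shows that idea under some assumptions: createPdfRenderer is a hypothetical helper name, and the launch function is passed in (for example () => puppeteer.launch()) so the helper itself does not import Puppeteer. It does not handle browser crashes or limit concurrency, which a real service would likely need.

```javascript
// Sketch: share one browser across many PDF jobs.
// createPdfRenderer is a hypothetical helper; `launch` is injected,
// e.g. createPdfRenderer(() => puppeteer.launch()).
function createPdfRenderer(launch) {
  // Cache the launch promise so concurrent callers share one browser.
  let browserPromise = null;

  function getBrowser() {
    if (!browserPromise) {
      browserPromise = launch().catch((error) => {
        browserPromise = null; // allow a later call to retry the launch
        throw error;
      });
    }
    return browserPromise;
  }

  async function renderPdf(html, pdfOptions = { format: 'A4', printBackground: true }) {
    const browser = await getBrowser();
    const page = await browser.newPage();
    try {
      await page.setContent(html, { waitUntil: 'networkidle0' });
      return await page.pdf(pdfOptions);
    } finally {
      await page.close(); // always release the page, even on failure
    }
  }

  async function close() {
    if (!browserPromise) return;
    const pending = browserPromise;
    browserPromise = null;
    // If the launch itself failed there is nothing to close.
    const browser = await pending.catch(() => null);
    if (browser) await browser.close();
  }

  return { renderPdf, close };
}

// Usage (assumes `import puppeteer from 'puppeteer'`):
// const renderer = createPdfRenderer(() => puppeteer.launch());
// const pdfBuffer = await renderer.renderPdf(invoiceHtml);
// await renderer.close(); // on shutdown
```

Opening a page is typically much cheaper than starting a browser, so this usually helps throughput, but the shared browser still needs to be monitored and restarted if it dies.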

The Simpler Way: Using a Dedicated API

Building and maintaining this infrastructure is a lot of work. For most projects, a dedicated API is a more efficient and reliable solution.

That’s why I built PageFlow. It’s a simple REST API that handles all the browser management, scaling, and maintenance for you.

Here’s how you would accomplish the exact same task using the PageFlow API:

// pageflow-example.js
// Uses the global fetch built into Node.js 18+, so no extra package is needed.
import fs from 'fs';

const API_KEY = 'YOUR_API_KEY'; // Get yours from app.pageflow.dev
const invoiceHtml = `...`; // The same HTML string from before

async function generateWithPageFlow() {
  try {
    const response = await fetch('https://api.pageflow.dev/convert', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ html: invoiceHtml })
    });

    if (!response.ok) {
      throw new Error(`API Error: ${response.status} ${response.statusText}`);
    }

    // Native fetch has no response.buffer(), so convert the ArrayBuffer instead
    const pdfBuffer = Buffer.from(await response.arrayBuffer());
    fs.writeFileSync('invoice-api.pdf', pdfBuffer);
    console.log('PDF generated with PageFlow: invoice-api.pdf');
  } catch (error) {
    console.error(error);
  }
}

generateWithPageFlow();

The result is the same high-quality PDF, but your code is simpler, and you have zero infrastructure to maintain.
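Any HTTP call can fail transiently, so in production you may want a timeout and a couple of retries around the request. The following is a generic sketch, not part of the PageFlow API: fetchPdfWithRetry and its options are hypothetical names, and it assumes that repeating a conversion request is safe and that only network errors and 5xx responses are worth retrying.

```javascript
// Sketch: wrap a PDF request with a timeout and simple retries.
// fetchPdfWithRetry and its options are hypothetical names. Assumes the
// request is safe to repeat and that 4xx responses should not be retried.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchPdfWithRetry(url, init, options = {}) {
  const {
    retries = 2,        // extra attempts after the first one
    timeoutMs = 30000,  // per-attempt timeout
    baseDelayMs = 500,  // doubled after every failed attempt
    fetchImpl = fetch,  // injectable for testing; global fetch on Node.js 18+
  } = options;

  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const response = await fetchImpl(url, {
        ...init,
        signal: AbortSignal.timeout(timeoutMs),
      });
      if (response.ok) {
        return Buffer.from(await response.arrayBuffer());
      }
      const error = new Error(`API Error: ${response.status} ${response.statusText}`);
      // Client errors (bad key, bad payload) will not succeed on retry.
      error.retryable = response.status >= 500;
      throw error;
    } catch (error) {
      if (error.retryable === false) throw error;
      lastError = error;
      if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}

// Usage with the request from the example above:
// const pdfBuffer = await fetchPdfWithRetry('https://api.pageflow.dev/convert', {
//   method: 'POST',
//   headers: { 'Authorization': `Bearer ${API_KEY}`, 'Content-Type': 'application/json' },
//   body: JSON.stringify({ html: invoiceHtml }),
// });
```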

Conclusion

Building your own PDF generator with Puppeteer is a great way to understand the underlying mechanics. But for a production application where speed, reliability, and your own time are critical, a dedicated API like PageFlow can be a game-changer.

Ready to give it a try? You can get your free API key from the PageFlow Dashboard and generate your first PDF in under a minute.