The Ultimate Guide to Generating PDFs from HTML with Node.js and Puppeteer
Posted on: August 20, 2025
Ever needed to generate a PDF invoice, a report, or an e-ticket from your web application? It’s a common requirement, but turning dynamic HTML into a pixel-perfect PDF can be surprisingly tricky.
In this guide, we’ll walk you through the entire process of building a robust PDF generation script using Node.js and Puppeteer, a Node.js library that controls a headless Chrome browser. We’ll cover the basics and also touch on the real-world challenges you might face in a production environment.
Prerequisites
Before we begin, make sure you have Node.js (version 18 or higher) and npm installed on your machine.
Step 1: Setting Up Your Project
First, let’s create a new project directory and initialize it with a package.json file.
mkdir pdf-generator
cd pdf-generator
npm init -y
Next, we need to install Puppeteer. During installation, the package also downloads a compatible browser build (Chrome for Testing in recent releases, Chromium in older ones) that we can control programmatically.
npm install puppeteer
Step 2: The Core PDF Generation Logic
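Now for the fun part. Create a file named generate.js and add the following code. The script uses ES module import syntax, so also add "type": "module" to your package.json; without it, Node.js 18 treats .js files as CommonJS and rejects the import statements. The script defines a function that takes a string of HTML, launches a headless browser, and saves the rendered content as a PDF.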
// generate.js
import puppeteer from 'puppeteer';
import fs from 'fs';
// Example HTML content for an invoice
const invoiceHtml = `
<html>
<head>
<style>
body { font-family: Arial, sans-serif; margin: 40px; }
.invoice-header { text-align: center; margin-bottom: 40px; }
.invoice-header h1 { margin: 0; }
.item-table { width: 100%; border-collapse: collapse; }
.item-table th, .item-table td { border: 1px solid #ddd; padding: 8px; }
.item-table th { background-color: #f2f2f2; }
.total { text-align: right; margin-top: 20px; font-weight: bold; }
</style>
</head>
<body>
<div class="invoice-header">
<h1>Invoice #123</h1>
<p>Issued: August 20, 2025</p>
</div>
<table class="item-table">
<thead>
<tr><th>Item</th><th>Quantity</th><th>Price</th></tr>
</thead>
<tbody>
<tr><td>Web Development Services</td><td>10</td><td>$150.00</td></tr>
<tr><td>API Consulting</td><td>5</td><td>$200.00</td></tr>
</tbody>
</table>
<div class="total">
Total: $2500.00
</div>
</body>
</html>
`;
async function generatePdfFromHtml(htmlContent) {
  let browser;
  try {
    console.log('Launching browser...');
    browser = await puppeteer.launch();
    console.log('Opening new page...');
    const page = await browser.newPage();
    console.log('Setting page content...');
    // 'networkidle0' waits until there have been no network connections for
    // 500 ms, so fonts and images referenced by the HTML have time to load
    await page.setContent(htmlContent, { waitUntil: 'networkidle0' });
    // page.pdf() already renders with print CSS; this call makes that explicit
    await page.emulateMediaType('print');
    console.log('Generating PDF...');
    const pdfBuffer = await page.pdf({
      format: 'A4',
      printBackground: true
    });
    console.log('Saving PDF...');
    fs.writeFileSync('invoice.pdf', pdfBuffer);
    console.log('PDF generated successfully: invoice.pdf');
  } catch (error) {
    console.error('Error generating PDF:', error);
  } finally {
    // Always close the browser, even after an error, so no process is left running
    if (browser) {
      console.log('Closing browser...');
      await browser.close();
    }
  }
}
generatePdfFromHtml(invoiceHtml);
You can run the script from your terminal:
node generate.js
After a few moments, you’ll have a beautifully rendered invoice.pdf in your project folder!
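The invoice above is hard-coded, but in a real application the HTML usually comes from data. Here is a minimal sketch of how that string could be built, assuming an invoice object with number, issued, and items fields. The names buildInvoiceHtml and escapeHtml are made up for this example and are not part of Puppeteer, and price is treated as a per-unit price.

```javascript
// invoice-template.js (hypothetical helper, not part of the script above)

// Escape characters with special meaning in HTML so that values coming
// from user data cannot break the markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Formats numbers as US dollars with two decimals.
const usd = new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' });

// Build the invoice markup from data. The total is quantity * unit price,
// summed over all items.
function buildInvoiceHtml({ number, issued, items }) {
  const rows = items
    .map(
      (item) =>
        `<tr><td>${escapeHtml(item.name)}</td>` +
        `<td>${escapeHtml(item.quantity)}</td>` +
        `<td>${usd.format(item.price)}</td></tr>`
    )
    .join('\n');
  const total = items.reduce((sum, item) => sum + item.quantity * item.price, 0);

  // The <style> block from the earlier example would go inside <head>.
  return `<html>
<head></head>
<body>
<div class="invoice-header">
<h1>Invoice #${escapeHtml(number)}</h1>
<p>Issued: ${escapeHtml(issued)}</p>
</div>
<table class="item-table">
<thead>
<tr><th>Item</th><th>Quantity</th><th>Price</th></tr>
</thead>
<tbody>
${rows}
</tbody>
</table>
<div class="total">
Total: ${usd.format(total)}
</div>
</body>
</html>`;
}

// Sample data matching the invoice used earlier in this guide.
const sampleInvoice = {
  number: '123',
  issued: 'August 20, 2025',
  items: [
    { name: 'Web Development Services', quantity: 10, price: 150 },
    { name: 'API Consulting', quantity: 5, price: 200 },
  ],
};
const sampleHtml = buildInvoiceHtml(sampleInvoice);
```

You could then pass sampleHtml to generatePdfFromHtml instead of the static string. Escaping matters here because item names often come from user input.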
Step 3: Real-World Challenges
While the script above works great locally, running Puppeteer in a production environment introduces a new set of challenges:
- Server Dependencies: Your server needs all the correct system libraries to run headless Chromium. This can be complex to manage, especially in Docker containers.
- Performance: Launching a full browser instance for every PDF generation can be slow and consume a lot of memory and CPU, which can be costly at scale.
- Maintenance: You’re now responsible for keeping the server, Node.js, and Puppeteer updated to patch security vulnerabilities.
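One common way to soften the performance problem is to launch the browser once and open a new page per PDF. The sketch below shows the idea under a few assumptions: createPdfRenderer, renderPdf, and launchPuppeteer are names invented for this example, and the launcher is passed in as a parameter so the reuse logic can be tried without a real browser. It does nothing about the system libraries or the maintenance burden.

```javascript
// pdf-renderer.js (hypothetical helper, a sketch rather than a hardened service)

// Default launcher: loads Puppeteer lazily and starts one browser.
async function launchPuppeteer() {
  const { default: puppeteer } = await import('puppeteer');
  return puppeteer.launch();
}

// Create a renderer that launches the browser on first use and reuses it.
function createPdfRenderer(launch = launchPuppeteer) {
  let browserPromise = null;

  // Concurrent callers share the same pending launch promise.
  function getBrowser() {
    if (!browserPromise) {
      browserPromise = launch().catch((error) => {
        browserPromise = null; // allow a retry after a failed launch
        throw error;
      });
    }
    return browserPromise;
  }

  // Render one HTML string to a PDF using a fresh page in the shared browser.
  async function renderPdf(htmlContent, pdfOptions = {}) {
    const browser = await getBrowser();
    const page = await browser.newPage();
    try {
      await page.setContent(htmlContent, { waitUntil: 'networkidle0' });
      return await page.pdf({ format: 'A4', printBackground: true, ...pdfOptions });
    } finally {
      await page.close(); // free the page even if rendering failed
    }
  }

  // Shut the shared browser down, for example when the server stops.
  async function close() {
    if (!browserPromise) return;
    const pending = browserPromise;
    browserPromise = null;
    const browser = await pending;
    await browser.close();
  }

  return { renderPdf, close };
}

// Usage (requires Puppeteer to be installed):
//   const renderer = createPdfRenderer();
//   const pdf = await renderer.renderPdf('<h1>Hello</h1>');
//   await renderer.close();
```

Each page still costs memory while it is open, and this sketch does not relaunch a browser that has crashed, so treat it as a starting point only.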
The Simpler Way: Using a Dedicated API
Building and maintaining this infrastructure is a lot of work. For most projects, a dedicated API is a more efficient and reliable solution.
That’s why I built PageFlow. It’s a simple REST API that handles all the browser management, scaling, and maintenance for you.
Here’s how you would accomplish the exact same task using the PageFlow API:
// pageflow-example.js
// Node.js 18+ ships a global fetch, so no extra package is needed
import fs from 'fs';

const API_KEY = 'YOUR_API_KEY'; // Get yours from app.pageflow.dev
const invoiceHtml = `...`; // The same HTML string from before

async function generateWithPageFlow() {
  try {
    const response = await fetch('https://api.pageflow.dev/convert', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ html: invoiceHtml })
    });
    if (!response.ok) {
      throw new Error(`API Error: ${response.status} ${response.statusText}`);
    }
    // Native fetch has no response.buffer(), so convert the ArrayBuffer instead
    const pdfBuffer = Buffer.from(await response.arrayBuffer());
    fs.writeFileSync('invoice-api.pdf', pdfBuffer);
    console.log('PDF generated with PageFlow: invoice-api.pdf');
  } catch (error) {
    console.error(error);
  }
}
generateWithPageFlow();
The result is the same high-quality PDF, but your code is simpler, and you have zero infrastructure to maintain.
Conclusion
Building your own PDF generator with Puppeteer is a great way to understand the underlying mechanics. But for a production application where speed, reliability, and your own time are critical, a dedicated API like PageFlow can be a game-changer.
Ready to give it a try? You can get your free API key from the PageFlow Dashboard and generate your first PDF in under a minute.