Puppeteer Browser Automation for Beginners

Published in ramosly | blog, Apr 26, 2020

So there I was, typing away, writing a post about how I’ve been using Puppeteer at work and the post was getting fairly long. I kept coming across things where I thought to myself, “Someone just starting out might get tripped up here 🤔”

I showed the draft to coworkers and they were like, “Yeah this could be broken up into separate posts.”

Thus, I thought I’d make a separate post! 🚀 This one is about getting set up with browser automation, specifically with Puppeteer.

Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.
- https://pptr.dev/

In my opinion, browser automation is a good way to get started with coding. You don’t have to write many lines and once you get a taste of having your code drive a web browser, you’ll be hooked.

Classics like, “Whoa.” and “This could be doing my timesheet for me… 🤔” will flood the ol’ brain-waves as you start to realize all the tedious, manual shit you’ve been doing, that you hate doing, that code could be doing for you 💅

An Analogy

Like driving a car, we don’t necessarily need to know how one works in order to drive one. The same goes with browser automation. As long as we know how to drive, or in this case use Puppeteer, we can go places 🚗

We also use cars because they get us to the store in minutes and can carry way more stuff than we can. I’m just saying, sometimes you need to buy a TV, or it’s raining outside 🤷‍♂ Similarly, Puppeteer will do everything faster than you can, and it won’t forget to do something you’ve told it to do.

What follows should cover everything you need to get up and running, but let me know if I’ve missed something 🤔

Prerequisites for getting started 🛫

First, here is a list of things you’ll need:

✍️ A text editor or IDE for writing JavaScript, like VS Code. There are others but… just use VS Code.
💻 NodeJS installed. Go to nodejs.org, then download and install NodeJS if you haven’t already.
📦 I’ll be using the yarn package manager to install Puppeteer, but you can use npm if you don’t already have yarn or don’t want to deal with installing it. You can install yarn from https://classic.yarnpkg.com/en/docs/getting-started
🤓 A terminal.
On an Apple computer, you can press Cmd+Spacebar, search for “terminal,” and click the result or press Enter to launch it.

Note: On Windows, you can press Windows-Key and search for “cmd” or “powershell” and use either one.
Since I’m using a MacBook, the commands I use for creating folders and navigating the file system will be for the macOS terminal.

Let’s create a smol project

Create a folder called puppeteer-example for our script.

$ mkdir puppeteer-example && cd puppeteer-example

mkdir for “make directory” and cd for “change directory”

Now let’s use yarn to initialize the project and create a package.json file.

If you don’t want to use yarn or deal with installing it, you can use npm init in place of yarn init below, and use npm install in place of yarn add below. Everything else should be the same.

$ yarn init

Run yarn add puppeteer to install it.

$ yarn add puppeteer

Note: at the time of this writing, Puppeteer 3.0 has just come out and is only compatible with newer versions of NodeJS. So you might get an error like this if you already had Node installed but have not updated in a while:

Gif of the installation error you’ll get when installing Puppeteer 3.0 with Node versions older than 10.18.1

If this is the case, you could install an older version of puppeteer, but you should probably update Node. You can follow this guide for how to update Node to the latest version for Linux, Mac, and Windows.
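If you’re not sure which version you have, running node --version in the terminal will print it. The same check can be sketched in JavaScript. Note that isAtLeast is a helper name I made up (it is not part of Node or Puppeteer), and 10.18.1 is the minimum mentioned above for Puppeteer 3.0.

```javascript
// Compare two dotted version strings such as '10.18.1'.
// `isAtLeast` is a made-up helper for illustration.
function isAtLeast(current, minimum) {
  const a = current.split('.').map(Number);
  const b = minimum.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    if (a[i] > b[i]) return true;
    if (a[i] < b[i]) return false;
  }
  return true; // the versions are equal
}

// process.versions.node is the running Node version without the leading "v".
const ok = isAtLeast(process.versions.node, '10.18.1');
console.log(ok ? 'Node is new enough for Puppeteer 3.0' : 'Time to update Node');
```

You could save this to its own file and run it with node to see which message you get.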

Now let’s create a file where we’ll put our code.

$ touch index.js

Pay no attention to the creepy command name; I just work here.

The following will open the current directory in VS Code, but you can use whichever editor you want.

$ code .

Writing 👏 A 👏 Script 👏

First things first. Like that scene in The Matrix where Neo learns kung fu, we need to teach Node how to speak Puppeteer. So let’s open our newly-created index.js file, and pull in Puppeteer.

// index.js
const puppeteer = require('puppeteer');

Now let’s add an async IIFE (Immediately Invoked Function Expression) that will execute when we run the file. We'll put all our Puppeteer stuff in there.

// index.js
const puppeteer = require('puppeteer');

(async () => {
    // Puppeteer stuff
})();

P.S. if you forget to put the semi-colon at the end of the require('puppeteer'); line, you might be in for hours of Googling trying to figure out why you're getting a TypeError: require(...) is not a function error. This happens because the next line starts with an opening parenthesis, so Node thinks we're trying to run something like require('puppeteer')(...). Anyway... add the semi-colon!
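Here is a small sketch of that gotcha that runs without Puppeteer installed. fakeRequire is a stand-in I made up so the example is self-contained; the real require behaves the same way as far as the missing semi-colon is concerned.

```javascript
// A stand-in for require() so this runs without Puppeteer installed.
const fakeRequire = (name) => ({ name });

function withoutSemicolon() {
  // No semicolon: JavaScript reads the next two lines as
  // fakeRequire('puppeteer')(() => {})() and tries to call the returned object.
  const lib = fakeRequire('puppeteer')
  (() => {})();
  return lib;
}

function withSemicolon() {
  const lib = fakeRequire('puppeteer');
  (() => {})();
  return lib;
}

try {
  withoutSemicolon();
} catch (err) {
  console.log(err instanceof TypeError); // true
}
console.log(withSemicolon().name); // puppeteer
```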

Every command/function in Puppeteer is “asynchronous.” This means when you call one of them, the function returns right away with a Promise, and the rest of the program carries on before the result is ready. So we need to await the Puppeteer functions, and we can only use the await keyword inside of an async function in JavaScript. Rules are rules.
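To see the difference await makes, here is a tiny sketch with no Puppeteer involved. slowAnswer is a made-up function standing in for any slow, asynchronous call.

```javascript
// A stand-in for a slow Puppeteer call: resolves with a value after a short delay.
function slowAnswer() {
  return new Promise((resolve) => setTimeout(() => resolve(42), 10));
}

async function demo() {
  const notAwaited = slowAnswer();    // a pending Promise, not the number
  const awaited = await slowAnswer(); // pauses here until the Promise resolves
  return { isPromise: notAwaited instanceof Promise, awaited };
}

demo().then((result) => console.log(result)); // { isPromise: true, awaited: 42 }
```

Without await we are holding a promise of a value; with await we get the value itself.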

First, we need an instance of a Browser and then we can use it to create a new Page. The puppeteer.launch() function will open an actual browser when we run it, and hand us back an object representing that browser.

When we want to hold onto the return-value of a function, we need to assign it to a variable.

await puppeteer.launch()

const browser = await puppeteer.launch()

The first example discards the return value, but we need it because it has functions we want to use, like newPage()! So let’s add the two new lines inside the function below.

// index.js
const puppeteer = require('puppeteer');

(async () => {
    // Puppeteer stuff
    const browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();
})();

Sweet. At this point, we have a Page “object.”

Most of the actions we do in Puppeteer are done on the Page object, not the browser. Technically we can have multiple pages and switch between them, so we need to create a new page and work with that.

There is a little more to do before we can run our script. So, onward!

Note: we’re passing the headless option and setting it to false to change the default. Setting headless to false makes the graphical user interface (GUI) actually display so you can see what Puppeteer is doing. By default, Puppeteer will drive the browser without displaying the GUI, because that’s faster and the GUI is usually unnecessary unless you’re debugging something.

Now we just need to tell Puppeteer to take us somewhere. So let’s do that real quick.

// index.js
const puppeteer = require('puppeteer');

(async () => {
    // Puppeteer stuff
    const browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();

    // navigate to a page
    await page.goto('https://www.google.com');
})();

Let’s also add a bit of house-keeping to close the browser after our script runs. We’ll wrap what we have so far in a try-finally block so that no matter what happens, we always close the browser.

// index.js
const puppeteer = require('puppeteer');

(async () => {
    // Puppeteer stuff
    const browser = await puppeteer.launch({ headless: false });
    try {
        const page = await browser.newPage();

        // navigate to a page
        await page.goto('https://www.google.com');
    } finally {
        console.log('Closing the browser...');
        await browser.close();
    }
})();

Notice that our first line, where we run puppeteer.launch(), was moved outside the try-finally.

This is because curly braces {} define a scoped “block” of code, and a variable declared with const only exists inside the block where it was declared. We want to close the browser in the finally block, so we declare browser above the try block, where both the try block and the finally block can see it.

If we hadn’t done this, we would get a ReferenceError (browser is not defined), and the browser wouldn’t close.

And then the terminal would sit there until you press Ctrl+C to terminate the running process.
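If the scoping bit is still fuzzy, here is a sketch with plain objects instead of a browser. The function names and the resource object are made up for illustration.

```javascript
function declaredInsideTry() {
  const log = [];
  try {
    const resource = { closed: false }; // only exists inside these braces
    log.push('working');
  } finally {
    // `resource` is out of scope here. typeof is a safe way to peek without
    // throwing; touching resource.closed would raise a ReferenceError.
    log.push(typeof resource);
  }
  return log;
}

function declaredOutsideTry() {
  const log = [];
  const resource = { closed: false }; // visible to both blocks below
  try {
    log.push('working');
  } finally {
    resource.closed = true;
    log.push('closed');
  }
  return { log, closed: resource.closed };
}

console.log(declaredInsideTry());  // [ 'working', 'undefined' ]
console.log(declaredOutsideTry()); // { log: [ 'working', 'closed' ], closed: true }
```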

Alright well now that I’ve spilled the beans on how to run this thing, let’s run what we have so far and make sure we’re on the right track.

$ node index.js

Neat 🎉

Now what? 🤔

Welp, let’s search for something!

Up until now, we have not actually done any interacting with the page. This part requires us to do a little digging. Keeping the car analogy going, we’ll need to look under the hood of the browser a bit in order to continue.

Bear with me here, as we’re going to get a little technical 😅

Looking Under the Hood 🚗

Puppeteer, and every other browser-automation tool, uses what are called locators or selectors to “locate” elements on the page. There are two steps to creating a locator.

Step 1: Find the Element

First, we can right-click the field on the page and choose the “Inspect” option in the context menu to expose the HTML element in the element hierarchy known as the Document Object Model (DOM). The DOM is basically the structure, or the scaffolding, of the page.

Gif of right-clicking the search field on Google.com and clicking “Inspect” to show the input element in the DOM inspector.

This not-so-secret window is known as Developer Tools. It’s used a lot by software developers and people like me who work on software development teams to test and make sure things are working as expected.

From within the Elements tab of the dev tools window, we can press Cmd+F (Ctrl+F on Windows) to expose the search field where we can test out our selectors.

The element we’re interested in is the <input... > tag that is highlighted when the window opens. The browser highlights it since this is the element we right-clicked before clicking Inspect.

Screenshot of the DOM after right-clicking the search field on the Google homepage, and clicking “Inspect”

Step 2: Write the Selector

Selectors are bits of text that browsers understand, which find, or “select,” an HTML element or a set of HTML elements in the current page.

There are two main types of selectors:

CSS selectors (Cascading Style Sheets)
XPaths (XML Path Language)

I usually use CSS selectors so let’s go with that.

Now, looking at the <input> element in the DOM, we need to use what we see to construct our selector so that it only selects this element. With CSS selectors, we can use the name of the html tag to select all elements in the DOM with that tag, which in this case is input.

Screenshot of the DOM inspector after searching for “input” and getting 14 results.

For better or for worse, the search not only runs selectors but also does text search. So, you’re gonna get every instance of “input” in the DOM, which is no help.

Not only that, but now we’ve lost where our element was 😤 We can get it back by right-clicking it and choosing “Inspect” again in the page.

Okay, so… that didn’t work. We need a selector that’s more specific. We’ll have to use the “attributes” of the input element to narrow down the results from 14, to just the one.

Hey, look at that, it has an attribute called name with a value of q, perhaps for “query” 🤔

The way we select an element based on one of its attributes is by enclosing the attribute and value in square brackets: input[name="q"] If we try typing this into the element search, we see that the browser highlights our target input element in green. We have a match!

Screenshot of the DOM inspector with input[name=”q”] in the search and the targeted input element highlighted green.
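Once a selector works in the dev tools, you can run the same sanity check from a script. This is a small sketch, assuming page is a Puppeteer Page like the one we created earlier; countMatches is a helper name I made up. Puppeteer’s page.$$() resolves to an array of every element matching the selector.

```javascript
// Resolves to how many elements on the page match the CSS selector.
// `countMatches` is a made-up helper; `page` is assumed to be a Puppeteer Page.
async function countMatches(page, selector) {
  const elements = await page.$$(selector); // array of matching element handles
  return elements.length;
}
```

Inside our async function, console.log(await countMatches(page, 'input[name="q"]')) should hopefully print 1 after the page has loaded.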

Tell Puppeteer the Good News

Now that we know our selector works, we can tell Puppeteer how to find the search field and enter searches for us.

I’m gonna copy the original index.js file into an example2/ folder to keep it as a separate version.

So, now our example2/index.js file looks like this:

const puppeteer = require('puppeteer');

(async () => {
    // Puppeteer stuff
    const browser = await puppeteer.launch({ headless: false })
    try {
        const page = await browser.newPage()

        // navigate to a page
        await page.goto('https://www.google.com')
        await page.type('input[name="q"]', 'tesla stock')
        await page.keyboard.press('Enter')
        await page.waitForNavigation()
    } finally {
        console.log('Closing the browser...')
        await browser.close()
    }
})();

We tell Puppeteer to type 'tesla stock' into the input[name="q"] field with the page.type() function.

Then we tell Puppeteer to press Enter.

Then, since a new page is going to load, we tell Puppeteer to wait for it with page.waitForNavigation().
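One thing to keep in mind about those last two steps: if the new page manages to finish loading before waitForNavigation() starts listening, the wait has nothing left to catch and will eventually time out. A commonly used pattern is to start the wait and the key press together with Promise.all. This is a sketch, assuming page is a Puppeteer Page; pressEnterAndWait is my own name for it.

```javascript
// Presses Enter and waits for the navigation it triggers.
// `pressEnterAndWait` is a made-up helper; `page` is assumed to be a Puppeteer Page.
async function pressEnterAndWait(page) {
  await Promise.all([
    page.waitForNavigation(),     // start listening for the navigation first
    page.keyboard.press('Enter'), // then trigger it
  ]);
}
```

The script as written above works fine for this example, so there is no need to change it before running.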

Let’s run it! 🏃

$ node example2/index.js

Gif of Puppeteer launching the browser, navigating to Google.com, searching for “tesla stock” and pressing Enter.

This happens to work, but it is generally a good idea to make your automation more robust by adding “waits” before interacting with elements. If your script doesn’t need it, then great! But if it’s not succeeding consistently, then it might be a sign that you need to wait for the element to be visible before trying to interact with it.

const puppeteer = require('puppeteer');

(async () => {
    // Puppeteer stuff
    const browser = await puppeteer.launch({ headless: false })
    try {
        const page = await browser.newPage()

        // navigate to a page
        await page.goto('https://www.google.com')

        const inputSelector = 'input[name="q"]'
        await page.waitForSelector(inputSelector, { visible: true })
        await page.type(inputSelector, 'tesla stock')
        await page.keyboard.press('Enter')
        await page.waitForNavigation()
    } finally {
        console.log('Closing the browser...')
        await browser.close()
    }
})();

The other change I made above was to save the selector in the inputSelector variable so that it can be reused in both the page.waitForSelector() function, and the page.type() function. Don’t repeat yourself! The last thing you want is for the selector to stop working and then have to update it in a million places. If you save it to a variable first, then if it changes you just update the variable and everything else gets the change for free.
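If you find yourself writing the wait-then-type pair over and over, you could take the same idea one step further and wrap it in a function. A minimal sketch, assuming page is a Puppeteer Page; typeWhenVisible is a name I made up.

```javascript
// Waits for the element to be visible, then types into it.
// `typeWhenVisible` is a made-up helper; `page` is assumed to be a Puppeteer Page.
async function typeWhenVisible(page, selector, text) {
  await page.waitForSelector(selector, { visible: true });
  await page.type(selector, text);
}
```

In the script above, the two lines that use inputSelector would become await typeWhenVisible(page, inputSelector, 'tesla stock').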

The page.goto() function also takes an optional “options” object as its second argument where you can tell Puppeteer to wait for all the browser requests to complete before continuing. (More on this in the page.goto() docs.)

await page.goto(url, { waitUntil: 'networkidle0' })
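For a slightly fuller picture of that options object, here is a sketch, assuming page is a Puppeteer Page. openAndSettle is a made-up helper name; waitUntil and timeout are options page.goto() accepts.

```javascript
// Opens a URL and waits until the network has gone quiet.
// `openAndSettle` is a made-up helper; `page` is assumed to be a Puppeteer Page.
async function openAndSettle(page, url) {
  return page.goto(url, {
    // Other accepted values: 'load' (the default), 'domcontentloaded', 'networkidle2'
    waitUntil: 'networkidle0',
    // Give up after 30 seconds, which is also Puppeteer's default
    timeout: 30000,
  });
}
```

Waiting for the network to go idle is slower than the default, so it is probably only worth it on pages that keep loading content after the initial load.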

Another thing you can do if you’re planning to use Puppeteer to automate tasks is remove the { headless: false } options object from the call to puppeteer.launch(). This way Puppeteer will go back to using the default mode, which is headless. Your script will run without having to actually display the browser, and will also run more quickly.
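If you want both modes without editing the file each time, one option is to decide based on an environment variable. This is only a sketch: SHOW_BROWSER is a variable name I made up, and launchOptions is a made-up helper that builds the object you would pass to puppeteer.launch().

```javascript
// Builds the options object for puppeteer.launch().
// SHOW_BROWSER is a made-up environment variable for this sketch.
function launchOptions(env) {
  return { headless: env.SHOW_BROWSER !== '1' };
}

console.log(launchOptions({}));                    // { headless: true }
console.log(launchOptions({ SHOW_BROWSER: '1' })); // { headless: false }
```

In the script you would call puppeteer.launch(launchOptions(process.env)), then run SHOW_BROWSER=1 node index.js whenever you want to watch the browser.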

We could go on from here with page.click(), but… I think you get the idea.

You can look at the Puppeteer documentation to see all the functions Puppeteer Page objects have available.

The best tools, in my humble opinion, tend to let you describe what you want to do, rather than make you describe how to do it. Puppeteer seems to do a good job of this. My experience using it has been largely positive, but I’m curious what others think and if/how others are using it.

Anyway, hopefully this is helpful. I’m thinking about making a YouTube video for this 🤔 Hit me up here, or on Twitter @ramojol if I left something out, made a mistake, or if you have questions 😅 🍻