In one of my previous articles I explained how we can use Lighthouse to put our website on a budget and why monitoring your website's performance is an important aspect of web development. In this article, we will go deeper into Lighthouse's building blocks and architecture, and learn how we can start auditing and collecting custom metrics for our web pages.

Lighthouse is an audit tool developed by Google which collects different metrics of your website. After collecting the metrics it will present a series of scores for your webpage. The scoring is divided into five main auditing areas.

Lighthouse audit

Besides scoring your webpage, Lighthouse provides more detailed information on where to focus your efforts, in what it calls "opportunities": areas of your website where a change may lead to a big performance improvement. In the following example, the application execution time is quite high.

Lighthouse opportunities

Lighthouse architecture

Lighthouse architecture is built around the Chrome Debugging Protocol, a set of low-level APIs for interacting with a Chrome instance. Lighthouse interfaces with the Chrome instance through the Driver. The Gatherers collect data from the page using the Driver. The output of a Gatherer is an Artifact, a collection of grouped metrics. An Artifact is then used by an Audit to test for a metric. The Audits assert on a specific metric and assign it a score. The output of the Audits is used to generate the Lighthouse report that we are familiar with.

Lighthouse architecture from https://github.com/GoogleChrome/lighthouse/blob/master/docs/architecture.md#components--terminology
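To make the flow from Driver to report more concrete, here is a minimal sketch of the same pipeline using plain functions. Every name in it (fakeDriver, gatherRenderTime, auditRenderTime, buildReport) and the 2000 ms budget are made up for illustration. None of this is Lighthouse's actual code, and the real Driver talks to Chrome rather than to an in-memory object.

```javascript
// Hypothetical stand-ins for Lighthouse's components; none of these names
// come from the Lighthouse code base.

// Driver: the only piece that talks to the browser. Here it is faked with
// a plain object instead of a Chrome Debugging Protocol connection.
const fakeDriver = {
  evaluate(expression) {
    const page = { renderTimeMs: 1800 }; // pretend page state
    return page[expression];
  },
};

// Gatherer: uses the Driver to collect raw data and returns an Artifact.
function gatherRenderTime(driver) {
  return { renderTimeMs: driver.evaluate('renderTimeMs') };
}

// Audit: turns an Artifact into a score (1 = within budget, 0 = over budget).
function auditRenderTime(artifact, budgetMs = 2000) {
  return {
    id: 'render-time',
    numericValue: artifact.renderTimeMs,
    score: artifact.renderTimeMs <= budgetMs ? 1 : 0,
  };
}

// Report: collects the results of all the audits.
function buildReport(driver) {
  const artifact = gatherRenderTime(driver);
  return { audits: [auditRenderTime(artifact)] };
}

const report = buildReport(fakeDriver);
console.log(report.audits[0]);
```

The point of the separation is that an Audit never touches the browser: it only sees the Artifact, so the same Artifact can feed several Audits.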

We will now take a close look at two of the Lighthouse building blocks by creating a simple audit tracking the internal rendering of a webpage. Creating an application is necessary in order to include a custom Gatherer or Audit since it is not possible to add any custom Gatherer or Audit directly into the Chrome panel.

Let's create our project and install lighthouse as a dependency

mkdir custom-audit && cd custom-audit
npm i --save lighthouse

To start auditing our website we will then create a new file scan.js where we will import Lighthouse and start scanning the webpage of choice. We will use programmatic access to Lighthouse by importing it inside our project

const lighthouse = require('lighthouse');
const chromeLauncher = require('chrome-launcher');

// Launches Chrome, runs Lighthouse against the given URL and returns the result (lhr)
async function launchChromeAndRunLighthouse(url, opts, config = null) {
  const chrome = await chromeLauncher.launch({chromeFlags: opts.chromeFlags});
  opts.port = chrome.port;
  const { lhr } = await lighthouse(url, opts, config);
  await chrome.kill();
  return lhr;
}

const opts = {};

// Usage:
(async () => {
  try {
    const results = await launchChromeAndRunLighthouse('https://izifortune.github.io/lighthouse-custom-gatherer', opts);
    console.log(results);
  } catch (e) {
    console.log(e);
  }
})();

If we now try to run our file we should be able to see the results coming from a lighthouse scan in the console:

node scan.js 
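The full result object is large, so it can help to print only the category scores. The sketch below assumes the result exposes a categories object keyed by category id, where each entry carries a score between 0 and 1 (or null when it could not be computed). The helper name summarizeScores and the sample data are made up for this example.

```javascript
// Builds a { categoryId: score out of 100 } summary from a Lighthouse result.
// Assumes `lhr.categories` is an object keyed by category id whose values
// carry a `score` between 0 and 1, or null when no score was computed.
function summarizeScores(lhr) {
  const summary = {};
  for (const [id, category] of Object.entries(lhr.categories)) {
    summary[id] = category.score === null ? null : Math.round(category.score * 100);
  }
  return summary;
}

// Made-up sample data standing in for a real scan result.
const sampleLhr = {
  categories: {
    performance: { title: 'Performance', score: 0.87 },
    accessibility: { title: 'Accessibility', score: 1 },
  },
};

console.log(summarizeScores(sampleLhr));
```

In scan.js the same helper could be called with the object returned by launchChromeAndRunLighthouse instead of the sample data.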

Now that we have a project with Lighthouse up and running we can start looking at how a Gatherer works and how we can use it in our project. We will use a webpage that I’ve created for this demo. In the page, I’m fetching todo list items from an API and rendering them on the page. I’m measuring the action using the Performance API as follows:

const getDataFromServer = async () => {
  performance.mark('start');
  const todos = await getTodos();
  renderTodos(todos);
  performance.mark('end');
  performance.measure('Render todos', 'start', 'end');
  const measure = performance.getEntriesByName('Render todos')[0];
}

Gatherer

A Gatherer is used by Lighthouse to collect data on the page. In fact, any data that is currently needed to perform the default lighthouse audits is collected through a Gatherer. We can extend the Gatherer base class and start creating custom ones:

const { Gatherer } = require('lighthouse');

class MyGatherer extends Gatherer {
  ...
}

The class Gatherer defines three different lifecycle hooks that we can implement in our class:


  • beforePass - called before the navigation to given URL

  • pass - called after the page is loaded and the trace is being recorded

  • afterPass - called after the page is loaded, all the other passes have been executed and a trace is available

A lifecycle hook is expected to return either an Artifact directly or a Promise which resolves to the desired Artifact. Depending on what data we are looking to collect from the Driver, and at what time, we can use any of the hooks just described.
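The order of the hooks can be illustrated with a small stand-alone runner. This is only a sketch: FakeGatherer, runPass and the example URL are hypothetical names, and the runner is not how Lighthouse itself schedules gatherers. It only mirrors the sequence described above and shows a hook returning a Promise that resolves to the Artifact.

```javascript
// Hypothetical mini-runner: NOT Lighthouse's implementation, only an
// illustration of the order in which the three hooks are invoked.
class FakeGatherer {
  constructor() {
    this.calls = [];
  }

  beforePass() {
    this.calls.push('beforePass'); // before navigating to the URL
  }

  pass() {
    this.calls.push('pass'); // page loaded, trace still being recorded
  }

  afterPass() {
    this.calls.push('afterPass'); // trace available
    // A hook may return the Artifact directly or a Promise resolving to it.
    return Promise.resolve({ duration: 42 });
  }
}

async function runPass(gatherer, context) {
  await gatherer.beforePass(context);
  // ... navigation to context.url would happen here ...
  await gatherer.pass(context);
  // ... the trace would be stopped here ...
  return gatherer.afterPass(context);
}

const gatherer = new FakeGatherer();
const artifactPromise = runPass(gatherer, { url: 'https://example.com' });

artifactPromise.then((artifact) => {
  console.log(gatherer.calls, artifact);
});
```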

Let’s now create a custom Gatherer which will collect the measurements from the Performance API. The Gatherer then needs to collect entries of type measure using a PerformanceObserver. We will proceed to create the file todos-gatherer.js:

'use strict';

const { Gatherer } = require('lighthouse');

// Runs inside the page: stores the duration of the first 'measure' entry
// on window and resolves the promise with it
function performance() {
  return new Promise((res) => {
    let logger = (list) => {
      const entries = list.getEntries();
      window.todosPerformance = entries[0].duration;
      res(entries[0].duration);
    }
    let observer = new PerformanceObserver(logger);
    observer.observe({ entryTypes: ['measure'], buffered: true });
  });
}

class TodosGatherer extends Gatherer {
  beforePass(options) {
    const driver = options.driver;
    // Serialize the function to a string and invoke it on every new document
    return driver.evaluateScriptOnNewDocument(`(${performance.toString()})()`);
  }

  afterPass(options) {
    const driver = options.driver;
    // Read back the value stored by the observer
    return driver.evaluateAsync('window.todosPerformance');
  }
}

module.exports = TodosGatherer;

Inside TodosGatherer we are using both the beforePass and afterPass hooks to contact the Driver and execute JavaScript inside the context of the current page, returning a promise. In beforePass we register a PerformanceObserver that is installed as soon as the new document is created, before the page's own scripts run. Since the observers are not buffered, registering it any later might lead to a race condition where the measure is recorded before anyone is listening. In afterPass we then collect the previously registered measure. To get an idea of all the methods that you can use on the driver object you can have a look here.
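The Driver receives a string of JavaScript rather than a function, which is why the Gatherer serializes the function with toString() and wraps it in an immediately invoked expression. The sketch below shows that pattern in isolation. It uses Node's vm module as a stand-in for the page, which is an assumption made purely for the example, and the collect function and the value 123 are made up.

```javascript
const vm = require('vm');

// Function we want to run "inside the page". It must be self-contained,
// because only its source text travels to the other context.
function collect() {
  globalThis.todosPerformance = 123; // stand-in for the measured duration
  return 'registered';
}

// Serialize the function and wrap it in an immediately invoked expression.
const expression = `(${collect.toString()})()`;

// A fresh vm context plays the role of the page here (an assumption for
// this example; Lighthouse sends the string to Chrome instead).
const fakePage = vm.createContext({});
const result = vm.runInContext(expression, fakePage);

console.log(result, fakePage.todosPerformance);
```

Because only the source text is transferred, the function cannot close over variables from the Node process. Anything it needs has to live inside its own body or already exist in the page.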