Caching Embedded JavaScript Applications in Server-Rendered Pages

April 22, 2018

Exploring and optimizing caching behavior of embedded JavaScript applications.

Today many of us write JavaScript single-page applications, where the application renders the whole screen. At the same time, it is not uncommon to simply enhance a portion of an existing server-rendered page (e.g., one served from WordPress) by embedding a JavaScript application that provides some specific interactive feature.

First, I differentiate this approach (embedding a JavaScript application) from using an iframe to embed an HTML document (with JavaScript). The critical problem with the iframe solution is that there are severe limitations on setting the size (height in particular) of the iframe, which might preclude its use.

So, what I mean by embedding a JavaScript application is using link and script tags to load the application’s CSS and JavaScript respectively.

We are going to explore strategies for caching (and cache-busting) these embedded JavaScript applications. In order to adopt these strategies, however, we will need to serve the application files from a web server where we can control the HTTP headers. One easy and cost-effective solution is to use an Amazon S3 bucket, as described in Configuring a Bucket for Website Hosting.

The examples below are available for download.

Straw Man Solution

Our first application makes no effort to manage caching; it just relies on the browser’s default caching behavior.

While this article’s JavaScript application is trivial, most are complex applications made up of multiple source files that are bundled together for deployment. As such, we will also bundle multiple source files together (using webpack) for illustrative purposes.

The three source files consist of the third-party lodash library and two additional files.

straw/src/index.js

import _ from 'lodash';
import './style.css';
function component() {
  var element = document.createElement('div');
  // Use lodash (bundled by webpack) to build the greeting.
  element.innerHTML = _.join(['Hello', 'webpack'], ' ');
  return element;
}
document.body.appendChild(component());

straw/src/style.css

body {
  background-color: yellow;
}

For deployment we want to build all these files together into a minimal number of files. At the same time, to avoid the dreaded flash of unstyled content (FOUC), we split out the CSS and ensure that it is loaded before the JavaScript. In the end we build two files; main.js containing all the JavaScript and main.css containing all the CSS.

We use a fairly straightforward webpack configuration, derived from the documentation (if you were wondering).

straw/webpack.config.js

const MiniCssExtractPlugin = require('mini-css-extract-plugin');
const path = require('path');
module.exports = {
  entry: './src/index.js',
  module: {
    rules: [
      {
        // Match files ending in ".css" (the dot must be escaped).
        test: /\.css$/,
        use: [
          // Extract the CSS into its own file instead of inlining it in the bundle.
          MiniCssExtractPlugin.loader,
          'css-loader',
        ],
      },
    ],
  },
  output: {
    filename: 'main.js',
    path: path.resolve(__dirname, 'dist'),
  },
  plugins: [
    new MiniCssExtractPlugin({
      filename: "[name].css",
    }),
  ],
};

In the straw project, the following command builds the main.js and main.css files in the dist folder.

npm run build

We then copy these two files to our web server (say an AWS bucket) and reference them from our server-rendered webpage (say an HTML file on a laptop).

index.html (on laptop)

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>My Scout</title>
  <link rel="stylesheet" type="text/css" href="https://s3.amazonaws.com/my-scout/main.css">
</head>
<body>
  <div>Sample existing page content</div>
  <script src="https://s3.amazonaws.com/my-scout/main.js"></script>
</body>
</html>

note: The library http-server is a quick and easy way of running a local test server (to serve up the index.html file); there are options to control the caching behavior.

Observations

  • With no cache control headers, the browser is left on its own to determine how long to cache the web page itself (index.html) and the two application files (main.js and main.css); some documentation suggests that the default caching behavior is five minutes.
  • If the caching duration is too short, browsers will have to repeatedly download the application files (often large files).
  • If the caching duration is too long, browsers will be delayed in getting application updates.
  • As a reminder, because we split out the CSS from the JavaScript bundle, we benefit from having the CSS and JavaScript download in parallel (in figure below, purple and orange blocks on top overlap) and yet evaluate in series with the CSS first (the purple block before the orange block below). Even if the CSS file took longer than the JavaScript file to download, the CSS would be evaluated first due to CSS being render blocking. This is a good thing as it prevents the dreaded FOUC.

Better Solution

A better solution is to have the application files cached until we want them updated. To accomplish this, we set the application files to be cached indefinitely and the web page itself to never cache. By renaming the application files (and updating the web page) we can update the application whenever we desire.
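
To see why renaming on change gives us cache busting, here is a minimal Node sketch that is separate from webpack. The hashedName helper is hypothetical, and the MD5 digest of the file contents is an assumption made purely for illustration; it is not webpack's own hashing implementation.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a file name from the contents so that
// any change to the contents yields a new name (and thus a new URL).
function hashedName(name, ext, contents) {
  const digest = crypto
    .createHash('md5')
    .update(contents)
    .digest('hex')
    .slice(0, 20); // shortened for readability
  return name + '.' + digest + '.' + ext;
}

const before = hashedName('main', 'js', 'console.log("v1");');
const after = hashedName('main', 'js', 'console.log("v2");');

// Different contents produce different names; identical contents do not.
console.log(before !== after); // true
```

Because the browser caches by URL, a file that keeps its name can safely be cached for a long time, and a changed file is fetched fresh simply because its URL is new.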

We update our webpack configuration to generate new build filenames when the source is changed.

better/webpack.config.js

  output: {
    filename: 'main.[hash].js',
    path: path.resolve(__dirname, 'dist'),
  },
  plugins: [
    ...
    new MiniCssExtractPlugin({
      filename: "[name].[hash].css",
    }),
  ],
};

note: This build generates new filenames if anything in the build changes; using chunkhash is a more optimized, preferred solution (I didn’t want to complicate the example here).

Also, because we are generating new file names, we will want to install and use clean-webpack-plugin to recreate the whole dist folder on every build.

better/webpack.config.js

const CleanWebpackPlugin = require('clean-webpack-plugin');
const MiniCssExtractPlugin = require('mini-css-extract-plugin');
...
  plugins: [
    new CleanWebpackPlugin(['dist']),
    new MiniCssExtractPlugin({
      filename: "[name].[hash].css",
    }),
  ],
...

In the better project, the following command builds the main.f38845c649a0c6950133.js and main.f38845c649a0c6950133.css files in the dist folder.

npm run build

We then copy these two files to our web server (say an AWS bucket) and add Cache-Control headers to cache them indefinitely (a year). We then update our server-rendered webpage (say an HTML file on a laptop) and ensure that index.html is never cached (server configuration).
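
As a rough sketch of that policy, the function below picks a Cache-Control value from a file name. The cacheControlFor name and the hash-detecting pattern are my own assumptions for illustration; how the value is actually attached (for example, as metadata on each uploaded object) depends on your server.

```javascript
// Assumption: hashed build files look like "main.f38845c649a0c6950133.js".
var HASHED_FILE = /\.[0-9a-f]{8,}\.(js|css)$/;

// Hypothetical helper: choose a Cache-Control value for a file name.
function cacheControlFor(filename) {
  if (HASHED_FILE.test(filename)) {
    // The name changes whenever the contents change,
    // so the file can be cached for a year (31536000 seconds).
    return 'public, max-age=31536000';
  }
  // Anything else (e.g. index.html) must be revalidated before each use.
  return 'no-cache';
}

console.log(cacheControlFor('main.f38845c649a0c6950133.js')); // public, max-age=31536000
console.log(cacheControlFor('index.html')); // no-cache
```

Note that no-cache still lets the browser store the response but forces it to check with the server before reusing it; no-store is the stricter option if the page should never be stored at all.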

index.html (on laptop)

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>My Scout</title>
  <link rel="stylesheet" type="text/css" href="https://s3.amazonaws.com/my-scout/main.f38845c649a0c6950133.css">
</head>
<body>
  <div>Sample existing page content</div>
  <script src="https://s3.amazonaws.com/my-scout/main.f38845c649a0c6950133.js"></script>
</body>
</html>

Observations:

  • We have achieved our goal of being able to manage the caching (and cache busting) of the application files.
  • By setting the server-rendered page to never cache, however, we have introduced a new problem: this page might be rather lengthy and thus take a while to download each time.
  • Also, to get this to work, we have to update the server-rendered page with the updated links to the application files each time we update the application (this is an error-prone pain).

Another Solution

The idea behind this solution is to create a small JavaScript file (which I will refer to as a scout file) that dynamically loads the application’s JavaScript and CSS files. We ensure that this filename (URL) never changes and is never cached. The server-rendered page references this file.

note: Inspiration for this approach is drawn from the scoutfile library. As that project is a bit stale and is complex (it has a lot of features), I thought to pursue a simpler solution.

We want to maintain the benefit of downloading the CSS and JavaScript files in parallel and evaluating them serially with the CSS file being evaluated first (thus avoiding FOUC).

In general, dynamically loading these files amounts to building and inserting the link and script tags through JavaScript; seemingly straightforward.

note: One thing I learned, the hard way, is that you have to set the async attribute to false on the script tags in order to have the JavaScript files (if there is more than one) execute in order.
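
A minimal sketch of that tag building might look like the following. The addStylesheet and addScripts names are hypothetical, and the document is passed in as doc only to keep the helpers easy to exercise outside a browser.

```javascript
// Hypothetical helper: insert a <link rel="stylesheet"> for the given URL.
function addStylesheet(doc, href) {
  var link = doc.createElement('link');
  link.rel = 'stylesheet';
  link.href = href;
  doc.head.appendChild(link);
  return link;
}

// Hypothetical helper: insert one <script> per URL, in the given order.
function addScripts(doc, srcs) {
  return srcs.map(function (src) {
    var script = doc.createElement('script');
    // Dynamically inserted scripts default to async; turning it off makes
    // them execute in insertion order while still downloading in parallel.
    script.async = false;
    script.src = src;
    doc.head.appendChild(script);
    return script;
  });
}

// In the browser, roughly:
// addStylesheet(document, 'https://s3.amazonaws.com/my-scout/main.css');
// addScripts(document, ['https://s3.amazonaws.com/my-scout/main.js']);
```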

The big problem, however, that I ran into in my exploration is that there seems to be no way to force the JavaScript to evaluate serially after the CSS file (and still have both files load in parallel). But you can (through callbacks) know when the files are evaluated by listening to their load events.

This forced me to change the JavaScript code that renders content. The earlier version renders to the DOM when it is evaluated (at which point I cannot be sure that the CSS file has already been evaluated). The new version sets a function on the window object that renders the application when called.

another/src/index.js

import _ from 'lodash';
import './style.css';
function component() {
  var element = document.createElement('div');
  element.innerHTML = _.join(['Hello', 'webpack'], ' ');
  return element;
}
// Expose the render step so the scout file can call it once the CSS has loaded.
window.renderApplication = function() {
  document.body.appendChild(component());
};

With this in place, we can now call window.renderApplication once both the CSS and JavaScript are evaluated.
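
To make the "once both are evaluated" part concrete, here is a hand-rolled sketch that waits for the load events of already-created link and script elements. The whenLoaded and renderWhenReady names are hypothetical, and this is not how any particular library implements it.

```javascript
// Resolve when the element fires `load`, reject when it fires `error`.
function whenLoaded(el) {
  return new Promise(function (resolve, reject) {
    el.addEventListener('load', function () { resolve(el); });
    el.addEventListener('error', function () {
      reject(new Error('Failed to load ' + (el.href || el.src)));
    });
  });
}

// Call `render` only after every element has loaded.
function renderWhenReady(elements, render) {
  return Promise.all(elements.map(whenLoaded)).then(function () {
    render();
  });
}

// In the browser, roughly (linkEl and scriptEl are the inserted tags):
// renderWhenReady([linkEl, scriptEl], function () {
//   window.renderApplication();
// });
```

The listeners need to be attached right after the elements are created; if a file has already finished loading by the time a listener is added, its load event is missed and the promise never resolves.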

While I was writing the implementation of this scout file, I found an existing library (LoadJS) that did exactly what I was trying to do.

I first built and uploaded (to my S3 bucket) the two files (e.g., main.76310564c34c2af2ceeb.js and main.76310564c34c2af2ceeb.css).

I then wrote a scout file and also uploaded it.

another/public/scout.js

'use strict';
(function () {
  var style = 'https://s3.amazonaws.com/my-scout/main.76310564c34c2af2ceeb.css';
  var script = 'https://s3.amazonaws.com/my-scout/main.76310564c34c2af2ceeb.js';
  // Load both files in parallel; render once both have loaded.
  loadjs([style, script], {
    success: function() {
      window.renderApplication();
    },
    async: false
  });
})();

note: It is rather unsavory to have to hard-code the bucket’s URL in both the HTML (below) and the scout file (I have an idea how to fix this; will do later).

I then changed the files on my server-rendered page (actually my laptop) as follows:

index.html (on laptop)

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>My Scout</title>
</head>
<body>
  <div>Sample existing page content</div>
  <script src="loadjs.min.js"></script>
  <script src="https://s3.amazonaws.com/my-scout/scout.js"></script>
</body>
</html>

note: I also downloaded loadjs.min.js to be alongside the index.html file.

In order to do detailed tests to determine if I achieved my goal, I stuffed a large amount of CSS (Bootstrap) at the top of the CSS file. Using Chrome DevTools, I also simulated a very slow network connection to be able to tease out important behavior.

First, you can see in this example that both the CSS and JS files are downloading in parallel (top purple and orange bars); with the CSS taking longer. We also see that the JavaScript is evaluated before the CSS is evaluated (the orange and purple bars in the bottom part).

Zooming into the tail end of this report, we can see the following events (in order).

  • CSS is downloaded (purple).
  • CSS is parsed (blue)
  • The success function (in scout.js) is evaluated (orange)
  • The renderApplication function (in main.76310564c34c2af2ceeb.js) is evaluated (orange)
  • A number of render operations happen (ending in a paint, in green).

The key point is that the renderApplication operation happens after the parsing of the CSS.

Fixed

To address the problem of duplicating the base URL in both index.html and scout.js, we set a data attribute:

index.html (on laptop)

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>My Scout</title>
</head>
<body>
  <div>Sample existing page content</div>
  <script src="loadjs.min.js"></script>
  <script id="scout" data-base-url="https://s3.amazonaws.com/my-scout/" src="https://s3.amazonaws.com/my-scout/scout.js"></script>
</body>
</html>

And use the data attribute to retrieve the base URL.

fixed/public/scout.js

'use strict';
(function () {
  var STYLE = 'main.76310564c34c2af2ceeb.css';
  var SCRIPT = 'main.76310564c34c2af2ceeb.js';
  // Read the base URL from the data-base-url attribute on the scout script tag.
  var scoutEl = document.getElementById('scout');
  var baseUrl = scoutEl.dataset.baseUrl;
  var style = baseUrl + STYLE;
  var script = baseUrl + SCRIPT;
  loadjs([style, script], {
    success: function() {
      window.renderApplication();
    },
    async: false
  });
})();
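
One caveat with plain string concatenation is that it depends on the data attribute ending in a slash. As an optional hardening, here is a small sketch of a helper (assetUrl is a hypothetical name, not part of the scout file above) that tolerates either form by using the standard URL constructor.

```javascript
// Hypothetical helper: join the base URL from the data attribute with a file name.
function assetUrl(baseUrl, file) {
  // Without a trailing slash, URL resolution would replace the last
  // path segment ("my-scout") instead of appending to it.
  var base = baseUrl.endsWith('/') ? baseUrl : baseUrl + '/';
  return new URL(file, base).href;
}

console.log(assetUrl('https://s3.amazonaws.com/my-scout', 'main.76310564c34c2af2ceeb.css'));
// https://s3.amazonaws.com/my-scout/main.76310564c34c2af2ceeb.css
```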

Wrap Up

We have achieved our goal of embedding our JavaScript application that:

  • Allows one to cache the server-rendered page as one sees fit
  • Doesn’t require repeated editing of the file / code generating the server-rendered page
  • Has optimized (parallel) downloading behavior
  • Has optimized caching behavior (use chunkhash as described above for optimal)
  • Avoids the dreaded FOUC

There are a couple of pain points with this solution:

  • Every change to the application requires hand-editing scout.js (updating the STYLE and SCRIPT variables) and uploading it alongside the application files.
  • If your build generates another JavaScript bundle file, say a vendor bundle, you need to remember to hand-edit scout.js.

At this point it is pretty obvious that we need to automate the build of the scout file before we use this in production. I am going to use this problem as inspiration for learning to write a webpack plugin (and will write about it).
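
To give a feel for what such automation has to produce, here is a rough sketch of a generator that emits the scout file's source from a list of built file names. The buildScoutSource function is hypothetical and is not a webpack plugin; a real plugin would obtain the file names from the build itself.

```javascript
// Hypothetical generator: return the source of a scout file for the given files.
function buildScoutSource(files) {
  return [
    "'use strict';",
    '(function () {',
    "  var baseUrl = document.getElementById('scout').dataset.baseUrl;",
    // JSON.stringify produces a valid JavaScript array literal of strings.
    '  var files = ' + JSON.stringify(files) + ';',
    '  var urls = files.map(function (file) { return baseUrl + file; });',
    '  loadjs(urls, {',
    '    success: function () { window.renderApplication(); },',
    '    async: false',
    '  });',
    '})();',
    ''
  ].join('\n');
}

// Print the generated scout.js for the two files from the example above.
console.log(buildScoutSource([
  'main.76310564c34c2af2ceeb.css',
  'main.76310564c34c2af2ceeb.js'
]));
```

Generating the file this way would also cover the vendor-bundle case, since every emitted file simply ends up in the list.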

Addendum

Loosely based on HTML Webpack Plugin, I wrote a webpack plugin (scout-webpack-plugin.js) that automates the build of the scout file; it is delivered alongside a sample application that uses it.

In order to get this sample to work, you will need to run the following in the folder:

npm install
npm run build

You then copy all the files in the dist folder to a server of your choice, e.g., an AWS S3 bucket. All the files except scout.js should be cached indefinitely, and scout.js should not be cached at all.

We then can embed the application into our server-generated HTML (alongside the loadjs.min.js file).

index.html (on laptop)

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>My Scout</title>
</head>
<body>
  <div>Sample existing page content</div>
  <div id="root"></div>
  <script src="loadjs.min.js"></script>
  <script id="scout" data-base-url="https://s3.amazonaws.com/my-scout/" src="https://s3.amazonaws.com/my-scout/scout.js"></script>
</body>
</html>

Observations:

  • Because I wanted to ensure that this plugin properly handled media (images) and dynamic imports, I swapped the application for one that uses them.
  • I ended up updating the webpack.config.js to handle media (images), output names based on chunkhash, and (obviously) use scout-webpack-plugin.js.
  • The introduction of dynamic imports requires the use of the Public Path (on the Fly) in order for bundles to be resolved. This is implemented in the entry point file for the application (index.js).
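
For readers unfamiliar with setting the public path on the fly, the sketch below shows the general shape. The publicPathFrom helper is hypothetical, and reading the value from the scout tag's data-base-url attribute is my assumption about a reasonable source for it; __webpack_public_path__ itself is the variable webpack provides for this purpose.

```javascript
// Hypothetical helper: read the base URL that the embedding page
// placed on the scout script tag.
function publicPathFrom(doc) {
  var scoutEl = doc.getElementById('scout');
  // Fall back to an empty (relative) public path if the tag is missing.
  return scoutEl ? scoutEl.dataset.baseUrl : '';
}

// In the application's entry point, before any dynamic import() runs:
// __webpack_public_path__ = publicPathFrom(document);
```

With the public path set this way, dynamically imported bundles are requested from the same base URL as the rest of the application files rather than relative to the embedding page.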

At this point, I do not have plans to turn this plugin into an npm-installable package; it is a bit too much work to support on my own.

