Component Level Isomorphic Webpack Code-Splitting

October 26, 2017


Webpack code-splitting is a standard way of increasing the performance of your React SPA. While this is a powerful and effective technique, up until recently code-splitting was not without limitations — most significantly the inability to effectively leverage code-splitting isomorphically. Happily, these limitations have finally been overcome. You can now code-split painlessly and granularly at the component level while rendering your isomorphic application seamlessly. It’s a huge milestone for the React / Webpack ecosystem.

In this article we will cover how the Discovery Digital Media Client Engineering Team uses a suite of NPM packages collectively called React Universal Component to allow our React SPA user interface to be data-driven from a CMS — while still supporting server side rendering. Our React components such as heroes and carousels are downloaded and rendered on-demand into our app as separate dynamically code-split bundles. This allows our product teams to shape the user experience of our site in the CMS while allowing the engineering team to deliver the smallest bundle possible to kickstart it all.

Note: The name universal components is also used to refer to React components that can work identically in either React or React Native (perhaps a better name for those types of components is hybrid components). That is not what we are discussing here. Universal components in the context of this article describes React-specific dynamically code-split components that are isomorphic — where the handoff from rendering them synchronously on the server to (a)synchronously on the client is seamless.

Universal components are a relatively advanced code-splitting technique, but don’t worry: even if you are new to code-splitting, we will first cover some history and foundational knowledge to help you get up to speed. Engineers familiar with code-splitting can skip or skim this preliminary material and find the heart of the technique later in the article.

Front-end engineering has come a long way from manually piecing together HTML, CSS, and JavaScript using <link> and <script> tags. With the advent of SPAs (single page applications), front-end development has become increasingly sophisticated. Transpilers such as Babel, build tools such as Webpack, and new progressive web techniques increasingly blur the line between web and native development, in both capability and developer experience.

As Tom Dale recently wrote in Compilers are the New Frameworks:

Increasingly, the bytes that get shipped to browsers will bear less and less resemblance to the source code that web developers write.
In the same way that a compiled Android binary bears little resemblance to the original Java source code, the assets we serve to users will be the aggressively-optimized output of sophisticated build tools. The trend started by minifiers like UglifyJS and continued by transpilers like Babel will only accelerate.

Let’s talk about one such important build tool, Webpack. Webpack is a module bundler. In its simplest usage, you give it a single JavaScript file which serves as the entry point for your application. Webpack then traverses this entry point file’s dependency graph (all of its imports and all of the imports of those imports, and so on). It then assembles all of the code contained within that dependency graph into a single output JavaScript file that Webpack calls a bundle. A clear advantage of this approach is that it automates what was previously an error-prone process: manually ordering and inserting <script> tags into an HTML document. The automation that Webpack provides is not only an improvement in method; more importantly, it is a change of conceptualization and form. No longer are web sites collections of linked documents, each containing a manually specified list of required scripts and CSS; they are now holistic bundled units: single page applications. Use of a build tool not only automates and improves the previous process, it also fosters new levels of innovation. Since we are using build tools to build applications instead of manually assembling them, we can add intelligence to those build tools to optimize our applications in increasingly sophisticated ways and for even better user experiences.

From the entry point Main JS, Webpack builds a dependency graph and transforms it into an output bundle.

While Webpack’s automated creation of a bundle from an entry point’s dependency graph is a significant improvement over a collection of manually added <script> tags, it is not without issue. Because the bundle encompasses the entirety of the dependency graph, the larger the code/functionality of the application, the larger the dependency graph and therefore the larger the bundle. Unoptimized, this single monolithic bundle contains every screen, every modal, every feature, and finally every NPM package required to make it all work. That can yield a bundle approaching or exceeding a megabyte, resulting in a slower download, degraded user experience, and greater distribution costs.

We mentioned that using a tool like Webpack to assemble code and assets into an application bundle resembles the production of an executable in native development. However, there is a critical difference between the two. With a native app, the user will find it normal and acceptable to download the application from an App Store before they can begin using it. Web users, on the other hand, have different expectations: webpages are expected to load quickly. Websites that load slowly are heavily penalized by user attrition and additionally in Google’s page ranking. Because of this critical difference, front-end web developers need to be more sophisticated about how we deliver our application and its various assets. A single large monolithic Webpack bundle, while convenient from an engineering development standpoint, unfortunately will not provide users with the experience they expect and deserve. Requiring users to download a large bundle before they can use a single page web application is a non-starter for world-class web applications, not only on mobile networks, which are slower and where bandwidth is more costly, but even on the desktop over wi-fi.

Luckily Webpack has a solution. That solution is called code-splitting. Code-splitting in basic terms allows you to divide your monolithic Webpack bundle into multiple smaller bundles which can be loaded on demand. Borrowing from the Webpack documentation, there are 3 general code-splitting methods:

1) Entry Points: Manually split code using entry configuration.
2) Prevent Duplication: Use the CommonsChunkPlugin to dedupe and split chunks.
3) Dynamic Imports: Split code via inline function calls within modules.

We’ll focus on methods 1 and 3 since those are most relevant to our discussion of Universal Components.

Entry points are relatively straightforward. Entry point code-splitting can be considered a “compile-time” approach — entry points are statically hard-coded in your Webpack configuration (mostly). For entry point code-splits, every entry point generates a single output bundle. These individual output bundles then must be added via <script> tags to your document.

A canonical example of using entry points to split and slim down a monolithic application bundle is to split out node_modules and other non-packaged JavaScript code dependencies into a separate vendor bundle. A single monolithic bundle, because it is a traversal of a dependency graph, will include not only your application-specific code, but also all of the various libraries and frameworks used by the application. Analyzing our codebase, we can note that the libraries and frameworks in node_modules used by our application will change less often than the application itself (where features are continually added). Therefore, it is advantageous to split the monolithic single bundle into 2 bundles: one for the application-specific code and one for the libraries and frameworks used by the application (the vendor bundle). In doing so, the less frequently changing libraries and frameworks in the vendor bundle can be stored in the user’s browser cache, speeding overall application delivery and start-up time when revisiting your site.

While entry point code-splits are useful, because they are hard-coded, they are more limited in possibility than the next code-splitting method we will discuss — dynamic imports.

Dynamic imports offer “run-time” code-splitting. Instead of statically specifying split points within the Webpack configuration itself using entry points, dynamic imports are split points that you write directly in your JavaScript code using Webpack-specific (for now) function calls. Unlike entry point bundles, which are output from the build process and then manually inserted by the developer into the page as script tags, dynamic imports are lazy-loaded when the code referencing them is run. This offers significant flexibility. Screens, modals, and features can be loaded on demand at the time they are needed, based on the user’s usage of your application. Like a video game that loads new levels as the character navigates the virtual world, your application loads screens as the user requires them. This can happen not only in response to the user clicking links within the app, but also through more advanced techniques such as preloading bundles in anticipation of the user’s actions.

Dynamic code-splitting is done within Webpack by using import() or require.ensure. When Webpack parses your code, it looks for these calls and generates a code-split at each one. The code required by the call becomes a new bundle, and the call itself is transformed into an AJAX/JSONP request.
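To make this concrete, here is a minimal sketch of a lazily loaded screen. The helper name lazy and the module path are hypothetical. The only Webpack-relevant part is the import() call, which Webpack treats as a split point.

```javascript
// Wrap an importer so that repeated calls share a single promise.
function lazy(importer) {
  let promise = null;
  return function load() {
    if (promise === null) {
      promise = importer();
    }
    return promise;
  };
}

// When bundled by Webpack, import() marks a split point: the module and its
// dependencies go into a separate bundle that is fetched when load() is called.
// './screens/Settings.js' is a hypothetical module path; nothing is loaded
// until loadSettings() is actually invoked.
const loadSettings = lazy(() => import('./screens/Settings.js'));

// Possible usage in a browser: preload on hover, render on click.
// button.addEventListener('mouseenter', loadSettings);
// button.addEventListener('click', () => loadSettings().then(showScreen));
```

Because the wrapper hands back the same promise every time, a preload triggered by anticipating the user’s action and the later "real" load both resolve from one request.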

Many tutorials on dynamic import code-splitting illustrate the technique using individual URL routes as split points. This is for good reason: in both traditional document-oriented websites and single page applications, URLs typically represent different screens of UI elements and functionality. Because users will likely only visit a subset of screens within a website or application, splitting at URLs ensures that the user only downloads the code for the screens that they need and nothing more. For many SPAs, splitting at smaller units of functionality than routes, such as individual components, can be excessive or even counter-productive (for example, when not using HTTP/2), so the chunkier granularity of code-splitting at routes is entirely sufficient. But for other types of sites, particularly CMS-driven ones such as Discovery’s TV Network websites, being more granular and having dynamically loaded code-split components has significant advantages.

Our CMS, in addition to containing typical domain object data such as shows and episodes, also contains layout data. Our application UI is not hard-coded, but instead data-driven. Each page has a layout which specifies a list of what components appear on that page as well as the data that each component needs. For example, our home page layout data specifies that it has a HeroCarousel component and supplies a list of TV show data that should be cycled through when the HeroCarousel renders.
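As an illustration only, layout data of this kind could be shaped something like the sketch below. The field names and show titles are invented for the example and are not our actual CMS schema; the point is that the set of components a page needs is only known once this data arrives.

```javascript
// Hypothetical layout payload; field names are assumptions, not our CMS schema.
const homePageLayout = {
  page: 'home',
  components: [
    { type: 'HeroCarousel', data: { shows: ['Show A', 'Show B'] } },
    { type: 'ShowGrid', data: { shows: ['Show C'] } },
    { type: 'HeroCarousel', data: { shows: ['Show D'] } },
  ],
};

// Collect the distinct component types a layout needs, in first-seen order.
function componentTypes(layout) {
  return [...new Set(layout.components.map((component) => component.type))];
}

console.log(componentTypes(homePageLayout)); // [ 'HeroCarousel', 'ShowGrid' ]
```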

Nearly every element on this page is data-driven from our CMS

Since our layout is data-driven, we do not know what components we need to render until we get layout data. Therefore there is significant benefit from code-splitting individual components as dynamic imports. Prior to switching our SPA codebase to Universal Components, we included in our main application bundle every possible React component that the CMS supported (as well as all A/B variants of those components). Including all of those components in our application bundle bloated download size and therefore detrimentally impacted our user experience. Alternatively, if we constrained the CMS to only support a limited palette of components, we could keep our monolithic bundle leaner but would negatively impact our producers’ ability to craft a user experience. Both are undesirable.

Therefore code-splitting at the granularity of components is the perfect solution. Except for one big problem…

For a single page application that only uses React client side rendering, the benefits of code-splitting are so good there’s little reason not to use it. However, for isomorphic apps which also do server side rendering (SSR), for SEO reasons for example, code-splitting doesn’t work out of the box. On the server, we use React’s renderToString to generate HTML from React components. The issue is that renderToString is a synchronous call, whereas dynamic code-split bundles are loaded asynchronously. This mismatch between a synchronous render call and asynchronous component loading cannot be reconciled directly, so a different mechanism needs to be devised. This gives rise to a healthy and enjoyable technical challenge.

In fact, the React Router team wrote about this specific issue:

We’ve tried and failed a couple of times. What we learned:
You need synchronous module resolution on the server so you can get those bundles in the initial render.
You need to load all the bundles in the client that were involved in the server render before rendering so that the client render is the same as the server render. (The trickiest part, I think its possible but this is where I gave up.)
You need asynchronous resolution for the rest of the client app’s life.
We determined that google was indexing our sites well enough for our needs without server rendering, so we dropped it in favor of code-splitting + service worker caching. Godspeed those who attempt the server-rendered, code-split apps.

The compromise they ultimately settled on, a (non)solution, was to eliminate SSR altogether. But thanks to Universal Components, this “trickiest part” is now solved and a real solution is at hand.

To solidify our understanding, let’s break down the technical problems that need to be addressed first, before describing how Universal Components solves them.

The initial problem to overcome is this:

  • We have a SSR React app, which uses React’s synchronous renderToString to render components
  • We want our components to be code-split, which is asynchronous. Code-splitting works on the client, but will not work on the server with a synchronous renderToString.

An easy solution to this initial set of problems was proposed, and an example GitHub project created, over 2 years ago by Ryan Florence, the co-author of React Router.

The concept and solution are simple. On the server, we patch Webpack’s asynchronous require.ensure and make it work like CommonJS’s synchronous require. On the client, we keep Webpack’s require.ensure, which is asynchronous. It’s a pretty standard isomorphic approach: determine which environment you are in (server or client) and conditionally branch functionality to mask server/client differences. While the solution is elegant, there are two limitations to the approach. First, it is tied to the implementation of an earlier version of React Router (although conceivably this can be overcome). Second and more importantly, the approach only works at the route level. Code-splitting on a more granular level, for individual CMS-driven components for example, will require something more sophisticated.
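The server half of that idea can be sketched in a few lines. This is a minimal illustration of the patching technique under plain Node, not necessarily how the example project itself is written; the helper name installEnsureShim is hypothetical, and the built-in path module stands in for a route module.

```javascript
// On the server (plain Node), require.ensure does not exist, so define it
// in terms of the synchronous CommonJS require.
function installEnsureShim(req) {
  if (typeof req.ensure !== 'function') {
    req.ensure = (dependencies, callback) => callback(req);
  }
  return req;
}

installEnsureShim(require);

// The same call site now works in both environments. Under Node the callback
// runs synchronously, before the next statement executes.
const order = [];
require.ensure([], (req) => {
  const path = req('path'); // stand-in for a route module
  order.push(typeof path.join === 'function' ? 'module loaded' : 'load failed');
});
order.push('after ensure');

console.log(order); // [ 'module loaded', 'after ensure' ]
```

On the client nothing is patched: Webpack rewrites require.ensure at build time, so there the callback fires only after the bundle has been downloaded.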

One reason more granular code-splits require a more advanced solution is that when creating Isomorphic React apps, the most important requirement is to ensure that what is generated on the server matches what is generated on the client for the initial render. If there is a mismatch between the two, you will get the dreaded:

Warning: React attempted to reuse markup in a container but the checksum was invalid (…):

When this checksum mismatch occurs, the advantages of SSR are lost. React will completely discard what you created on the server and recreate it on the client. This will significantly slow the time needed before the user can interact with your application.

If you use Ryan’s solution, but code-split individual components instead of routes, you will encounter a checksum mismatch. The reason for this is that on initial client render the asynchronous code-split component is not there — it is being downloaded. While it is being downloaded, nothing will be rendered for that missing component (a placeholder loader component could be rendered instead, but the overall mismatch problem would remain). When React compares the server render output which has a fully rendered component with the client render output where the same component is either empty or a placeholder, the overall renders don’t match, and React discards the server render, creating a significant performance degradation.

Therefore the next problem to overcome is:

  • The server rendered HTML must match the client rendered HTML to avoid checksum mismatch, however client rendered components are not available on initial render since they are asynchronously loaded. Therefore the renders don’t match and the server side render is discarded.

We need a way for the initial client render to match the server side render. As the React Router team noted:

You need to load all the bundles in the client that were involved in the server render before rendering so that the client render is the same as the server render. (The trickiest part, I think its possible but this is where I gave up.)

So to solve the previous problem, we actually need to solve:

  • We need to know what components were rendered on the server
  • We need to find out which Webpack dynamically code-split bundles contain those components
  • We need to add script tags for the code-split bundles of those server-rendered components to our HTML page, so the components are available prior to the initial client render and therefore render synchronously.

And finally:

  • After the initial client render, all dynamically code-split component bundles should be loaded asynchronously as usual.

Wow, that’s a lot to handle. But that’s exactly what Universal Components does!

To make this magic work, Universal Components uses a suite of packages. The use of multiple packages is the author’s design choice to avoid a “framework” approach whereby you are locked into a particular encompassing solution. This flexibility adds complexity. Therefore, people who want an all-in-one approach, or who would benefit from another take on the problem space that Universal Components addresses, should look at react-loadable, a similar package that solves many of the same issues (but is much more limited).

Let’s examine each of the packages that comprise Universal Components and how they work together.

In this article we cover Universal Components as a high level introduction. You may want to take a moment to clone the Universal Components demo repo by the author of Universal Components @faceyspacey to augment our high level intro with the code in his demo.

react-universal-component provides an exported function called universal which wraps your existing components and gives them isomorphic superpowers. And that’s the key: because it wraps components, all of your existing components will work with it, and you don’t have to change them. Because it is a wrapper, you can bring Universal Components into your application and evaluate it in a trial fashion.

Let’s look at a basic call of the function which creates a React universal component, illustrating its functionality with a minimal number of arguments:

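import universal from 'react-universal-component';

// The template literal makes the import path depend on the name prop.
const UniversalComponent = universal(
  props => import(`./components/${props.name}`),
);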

This simple call masks a great deal of power. First, note that we are using import, so we are creating a Webpack code-split. But this is a unique type of Webpack code-split: we are using an expression, not a hardcoded string. How an expression inside import is handled here is Webpack-specific behavior, not something defined by the JavaScript language. The component we ultimately import is parameterized by the name prop passed to UniversalComponent. When Webpack encounters a dynamic import with an expression such as this, it treats the expression like a wildcard and generates code-splits for all files that match that glob, producing a bundle for each split. Each bundle contains the code of a single React component. Therefore we can universally load any component under ./components with this one master React universal component!

To learn more about dynamic Webpack imports with expressions and how they relate to universal components, see “React Universal Component 2.0 & babel-plugin-universal-import”

The universal React component created with the above function call can then be used in the following way:

<UniversalComponent name="MyComponent" />

MyComponent, which is under ./components, was code-split into its own Webpack bundle. As a universal component (and using the additional packages described below) it will then be synchronously loaded and rendered on the server and initial client render and asynchronously rendered thereafter.

Now that we’ve shown what react-universal-component does and how it is called, let’s cover one internal detail of how it works. When the universal component’s componentWillMount lifecycle method is called on the server, its Webpack module name/id is added to an internal array. That way, when rendering is complete, there is a list of all universal components which were used. If you are familiar with react-helmet, the technique is very similar: record and flush. This internal array will come in handy for the next package in the Universal Components family, webpack-flush-chunks.
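The record-and-flush pattern itself is small enough to sketch. The following is a generic illustration of the pattern, not the library’s actual implementation; the function names recordChunk and flushRecorded are hypothetical.

```javascript
// Names recorded during the current server render.
const renderedChunkNames = new Set();

// Called as each universal component mounts during the server render.
function recordChunk(chunkName) {
  renderedChunkNames.add(chunkName);
}

// Called once after rendering: return what was recorded and reset,
// so the next request starts with an empty list.
function flushRecorded() {
  const names = Array.from(renderedChunkNames);
  renderedChunkNames.clear();
  return names;
}

recordChunk('HeroCarousel');
recordChunk('ShowGrid');
recordChunk('HeroCarousel'); // rendered twice, recorded once

console.log(flushRecorded()); // [ 'HeroCarousel', 'ShowGrid' ]
console.log(flushRecorded()); // []
```

Resetting on flush matters on a server, where the same module-level state would otherwise leak component names from one request into the next.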

We said above that when react-universal-component is executed on the server, it records a list of all of the universal components which were rendered. This list of rendered server components is used by webpack-flush-chunks. Knowing what universal components were rendered, the responsibility of webpack-flush-chunks is to find out what Webpack bundles house those components.

How does it do that? The key to this comes from Webpack. When Webpack runs, it can be configured to output a statistics JSON file in addition to its normal output. This statistics file contains the data that maps the code-split points established with dynamic imports to their output bundle file names.
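A trimmed-down picture of the lookup this mapping makes possible is sketched below. Webpack’s stats output does include an assetsByChunkName field, but the chunk names, file names, and the helper filesForChunks are made up for illustration; this is not the code of webpack-flush-chunks.

```javascript
// A trimmed-down stand-in for Webpack's stats JSON. assetsByChunkName maps
// each chunk name to its output file or files; the names here are made up.
const stats = {
  assetsByChunkName: {
    bootstrap: 'bootstrap.js',
    vendor: 'vendor.js',
    main: 'main.js',
    HeroCarousel: ['HeroCarousel.js', 'HeroCarousel.css'],
    ShowGrid: 'ShowGrid.js',
  },
};

// Resolve a list of chunk names to the files that contain them.
function filesForChunks(webpackStats, chunkNames) {
  const files = [];
  for (const name of chunkNames) {
    const assets = webpackStats.assetsByChunkName[name];
    if (assets === undefined) continue; // unknown chunk: nothing to add
    files.push(...[].concat(assets)); // value may be a string or an array
  }
  return files;
}

console.log(filesForChunks(stats, ['HeroCarousel', 'ShowGrid']));
// [ 'HeroCarousel.js', 'HeroCarousel.css', 'ShowGrid.js' ]
```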

webpack-flush-chunks cross-references the list of components collected during server render by react-universal-component with the information within the Webpack statistics file. By doing this, webpack-flush-chunks knows which Webpack bundle files must be added as script tags to our server rendered HTML page. webpack-flush-chunks then provides functions (flushChunks/flushFiles) to get this data. This is the key to making the initial client render of universal components synchronous. Let’s explain how.

By default, the flushChunks function of webpack-flush-chunks gives you script tags to add to your HTML page in the following order:

  • bootstrap: the Webpack runtime
  • vendor (optional): Vendor is any entry-point bundle(s), such as those that contain your node_modules libraries
  • components: Components are the Webpack bundles that contain the universal components that were rendered on the server
  • main: Main is your application bundle (e.g. your redux code, your business logic, components which aren’t universal, etc.)
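The ordering in the list above can be sketched as a small helper that turns file names into script tags. The function name scriptTags, the /static/ URL prefix, and the file names are all assumptions for illustration; in practice webpack-flush-chunks produces this output for you.

```javascript
// Build script tags in the order: bootstrap, vendor, components, main.
// The '/static/' prefix and all file names are hypothetical.
function scriptTags({ bootstrap, vendor = [], components, main }) {
  const ordered = [...bootstrap, ...vendor, ...components, ...main];
  return ordered
    .map((file) => `<script src="/static/${file}"></script>`)
    .join('\n');
}

const tags = scriptTags({
  bootstrap: ['bootstrap.js'],
  vendor: ['vendor.js'],
  components: ['HeroCarousel.js'],
  main: ['main.js'],
});

console.log(tags);
```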

The critical piece to this is that all code-split bundles generated by Webpack are wrapped in a function call like this:

webpackJsonp([37],{ /* your bundle code */ });

webpackJsonp is a function in the Webpack bootstrap code that, when called, adds the bundle code to Webpack’s module cache. Subsequent requires of that module will synchronously retrieve it from the cache. Therefore, by placing the component bundle script tags that flushChunks/flushFiles gives us before the main application bundle script tag, you ensure that on initial client render those components are already in Webpack’s require cache and therefore render synchronously. By rendering synchronously on the initial client render, the initial client render will match what was rendered on the server, thereby preventing SSR/client checksum mismatches. Thereafter, components are loaded asynchronously on-demand via XHR. This is the solution to the “tricky part” noted by the React Router team!
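A toy model may help show why script order makes the first render synchronous. The sketch below is a deliberately simplified imitation of the idea, not Webpack’s actual runtime; toyWebpackJsonp, toyRequire, and the module id 101 are invented for the example.

```javascript
// A toy model of the idea, not Webpack's actual runtime code.
const moduleFactories = {}; // module id -> factory registered by a bundle
const moduleCache = {};     // module id -> record of an executed module

// Each code-split bundle calls this to register the modules it contains.
function toyWebpackJsonp(chunkIds, modules) {
  Object.assign(moduleFactories, modules);
}

// Synchronous lookup: works only if the module's bundle has already run.
function toyRequire(id) {
  if (moduleCache[id]) return moduleCache[id].exports;
  const factory = moduleFactories[id];
  if (!factory) throw new Error(`Module ${id} has not been loaded yet`);
  const mod = { exports: {} };
  moduleCache[id] = mod;
  factory(mod, mod.exports, toyRequire);
  return mod.exports;
}

// A component bundle's script tag runs before the main bundle...
toyWebpackJsonp([37], {
  101: (mod) => { mod.exports = { name: 'HeroCarousel' }; },
});

// ...so the main bundle can obtain the component synchronously.
console.log(toyRequire(101).name); // HeroCarousel
```

If the main bundle ran first instead, the lookup would find nothing registered, which corresponds to the case where the client has to fall back to loading the component asynchronously.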

With react-universal-component and webpack-flush-chunks, our page loads the bootstrap, vendor, component (HeroCarousel, etc.), and main app bundles. HTTP/2 is highly recommended for performance.

In React projects it is common to use CSS modules to scope CSS to a particular React component. With Webpack, CSS modules are typically paired with extract-text-webpack-plugin. This Webpack plugin looks at all of the imported CSS files across your React components and extracts them into a single CSS file. With react-universal-component, however, we are loading React components on demand, so it does not make sense to send the CSS for a component which we may never need. You could of course continue to use extract-text-webpack-plugin, but with extract-css-chunks-webpack-plugin there is no need to. Since our React component is on-demand, it makes sense that the CSS associated with that React component is on-demand as well.

The magic that makes extract-css-chunks-webpack-plugin work is a bit challenging to explain, so let’s take it step by step. This requires us to introduce the last package that completes the Universal Components ecosystem: babel-plugin-universal-import.

We’ve covered a lot of ground in this article, and with babel-plugin-universal-import we can complete the picture. First, let’s note that this is a Babel plugin. If we look at the Babel documentation, we learn:

Babel is a compiler. At a high level, it has 3 stages that it runs code in: parsing, transforming, and generation (like many other compilers).

Babel plugins affect the 2nd stage, transformation. Therefore babel-plugin-universal-import transforms the code that you write into different code. In particular, it does this:

import universal from 'react-universal-component'
const UniversalComponent = universal(import('./Foo.js'))

<UniversalComponent />

↓ ↓ ↓ ↓ ↓ ↓

import universal from 'react-universal-component'
import universalImport from 'babel-plugin-universal-import/universalImport.js'
import importCss from 'babel-plugin-universal-import/importCss.js'
import path from 'path'

const UniversalComponent = universal(universalImport({
  chunkName: () => 'Foo',
  path: () => path.join(__dirname, './Foo.js'),
  resolve: () => require.resolveWeak('./Foo.js'),
  load: () => Promise.all([
    import( /* webpackChunkName: 'Foo' */ './Foo.js'),
    importCss('Foo')
  ]).then(proms => proms[0])
}))

<UniversalComponent />

babel-plugin-universal-import effectively turns react-universal-component’s universal function call into a macro that expands to code that handles the boilerplate of setting up a Universal Component with CSS. As you can see, the transformed code calls react-universal-component’s universal() function with additional parameters, in particular one that downloads the CSS associated with the Webpack bundle containing the universal component at the same time as the component itself. With the universal component and its associated CSS arriving together, everything needed to load a component is on-demand.

With extract-css-chunks-webpack-plugin and babel-plugin-universal-import, CSS is also on-demand for components needed by our CMS driven layouts.

Universal Components are a powerful technique for creating on-demand isomorphic components, a feat which until recently was simply not possible. The use cases for on-demand isomorphic components are many. In Discovery’s case, our layout was CMS-driven and therefore the components needed to render our UI were data-driven and unknown in advance. Another general use case that benefits from Universal Components is A/B testing components. Having on-demand components makes your Webpack bundle smaller, leading to significantly faster page load times and a better user experience.

As we learned in this article, while the benefits of Universal Components are significant, implementing Universal Components is rather involved. However the complexity of the technique offers a large advantage — it requires you to gain a deeper understanding of the way that Webpack and Babel work and how they can be used to create functionality that was previously impossible. As we mentioned in the beginning of the article, there is a new form of web applications taking shape in which tooling takes a primary role. By creating and using Webpack and Babel plugins to transform and assemble your code you can unlock new possibilities and novel solutions.

While this technique is complex, there is hope. As more people learn about this method and appreciate its power, we expect that the API and tools for creating and using these universal components will be further improved, and that additional articles and tutorials will be written to ease the learning curve and increase adoption. Until then, we hope that this article as well as the ones linked to below help start you on your way. Feel free to comment and ask questions. We’re excited to share with you the tools, techniques, and challenges we face and overcome as a client engineering team at Discovery Digital Media.

