Demystifying HTTP Caching

May 07, 2018


HTTP caching is a powerful technique that yields huge performance benefits, saves bandwidth, and reduces server traffic, but only if it is done right. The bitter truth is that most developers give caching very little attention, which leads to race conditions in which interdependent resources get out of sync. I am a freelance JavaScript developer, and I'll share a few best practices based on my experience.

1. The Cache-Control general-header field is used to specify directives for caching mechanisms in both requests and responses. Caching directives are unidirectional: a directive present in a request is not implied to be present in the response. I will list only the most commonly used directives. The full list can be found here.

  • no-store
  • no-cache
  • max-age=<seconds>
  • must-revalidate

2. The Pragma HTTP/1.0 general header is used for backwards compatibility with HTTP/1.0 caches, where the HTTP/1.1 Cache-Control header is not yet supported. It has only one directive: no-cache.

We will mostly deal with the directives above, and I will explain them as we go deeper into the patterns. Most patterns found on the internet fall into one of the following.

Pattern 1: No caching. To turn off caching, send the following response header.

Cache-Control: no-cache, no-store, must-revalidate

This instructs the browser and any intermediate caching server not to store the response.
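As a rough sketch of how a server might send these headers, the helper below sets them on a Node.js response object. The name `applyNoCachingHeaders` is made up for illustration, and adding `Pragma: no-cache` for HTTP/1.0 caches is an assumption based on the Pragma note above rather than something this pattern requires.

```javascript
// Hypothetical helper: mark a response as non-cacheable.
// `res` is anything with a setHeader(name, value) method,
// such as the response object passed to a Node.js http server handler.
function applyNoCachingHeaders(res) {
  // Understood by HTTP/1.1 browsers and proxies.
  res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');
  // Fallback for HTTP/1.0 caches that do not know Cache-Control.
  res.setHeader('Pragma', 'no-cache');
  return res;
}

// Possible usage with Node's built-in http module (not started here):
// require('http').createServer((req, res) => {
//   applyNoCachingHeaders(res);
//   res.end('never cached');
// });
```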

no-store means the resource must not be stored anywhere (i.e. in the browser cache or in a proxy cache).

no-cache doesn’t mean “don’t cache”; it means the cache must revalidate with the server before using the cached resource.

must-revalidate doesn't mean "must revalidate"; it means the local resource can be used if it's younger than the provided max-age, otherwise it must be revalidated.

max-age specifies the maximum amount of time, in seconds, for which a resource will be considered fresh. This directive is relative to the time of the request, whereas Expires specifies an absolute expiration date and time in GMT.
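To make the four directives concrete, here is a simplified sketch of the decision a cache makes for a stored response. The function names `parseCacheControl` and `cacheDecision` are hypothetical, and the logic deliberately ignores many details of real caches (Expires, heuristic freshness, serving stale content); it only models the rules described above.

```javascript
// Parse a Cache-Control header value into { directive: value | true }.
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(',')) {
    const [name, value] = part.trim().split('=');
    if (name) directives[name.toLowerCase()] = value === undefined ? true : value;
  }
  return directives;
}

// Decide what to do with a stored response that is `ageSeconds` old.
// Returns 'fetch', 'revalidate', or 'use-cache'.
function cacheDecision(header, ageSeconds) {
  const d = parseCacheControl(header);
  if (d['no-store']) return 'fetch';        // nothing was allowed to be stored
  if (d['no-cache']) return 'revalidate';   // stored, but always checked first
  const maxAge = typeof d['max-age'] === 'string' ? Number(d['max-age']) : NaN;
  if (Number.isFinite(maxAge) && ageSeconds < maxAge) return 'use-cache';
  return 'revalidate';                      // stale or no lifetime given
}

console.log(cacheDecision('no-cache, no-store, must-revalidate', 0));
console.log(cacheDecision('max-age=31536000', 3600));
console.log(cacheDecision('must-revalidate, max-age=600', 700));
```

In this model a response with max-age=600 is served from the cache without any network call until it is 600 seconds old, regardless of whether the server copy has changed in the meantime.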

Pattern 2: Long-lived caching for versioned URLs

Cache-Control: max-age=31536000

If the content at this URL never changes, the browser and caching servers can cache this resource for 365 days without a problem. Cached content younger than max-age seconds can be used without consulting the server.

<img src="/cats-v1.jpg" alt="…">
<script src="/script-v2.1.0.js"></script>
<link rel="stylesheet" href="/styles-b137cbdf.css">

Each URL contains something that changes along with its content. It could be a version number, the modified date, a hash of the content, or anything else that distinguishes one version from another.

If the URL changes, it denotes a different resource. The browser will discard the previous resource on its own once it expires. Developers typically use task runners such as gulp or grunt to revision static resources for each release. This is commonly called cache busting.
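A minimal sketch of content-hash revisioning, using only Node's built-in modules rather than any particular gulp or grunt plugin. The function name `revisionedName`, the choice of MD5, and the 8-character hash length are all assumptions made for illustration.

```javascript
const crypto = require('crypto');
const path = require('path');

// Build a file name that changes whenever the content changes,
// e.g. "styles.css" -> "styles-<8 hex chars>.css".
// Expects a bare file name, not a path with directories.
function revisionedName(fileName, content) {
  const hash = crypto
    .createHash('md5')       // a fingerprint, not a security measure
    .update(content)
    .digest('hex')
    .slice(0, 8);
  const { name, ext } = path.parse(fileName);
  return `${name}-${hash}${ext}`;
}

console.log(revisionedName('styles.css', 'body { color: black; }'));
console.log(revisionedName('styles.css', 'body { color: red; }'));
```

Because the name is derived from the content, an unchanged file keeps its URL across releases and stays cached, while an edited file gets a new URL that no cache has seen before.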

Pattern 3: Always revalidate with the server

Cache-Control: no-cache

In this pattern, the browser does not trust the state of its local copy and always validates with the server to check its freshness.

<img src="/cats.jpg" alt="…">
<script src="/script.js"></script>
<link rel="stylesheet" href="/styles.css">

Here you can add an ETag or Last-Modified header to the response. The next time the client fetches the resource, it echoes the value for the content it already has via If-None-Match or If-Modified-Since respectively, allowing the server to reply with HTTP 304 (Not Modified) if nothing has changed. Otherwise the server sends the full content.

This pattern always involves a network call, whereas Pattern 2 does not. So Pattern 2 is considerably better than this one.
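The ETag exchange can be sketched as a pure function that takes the request headers and the current content and returns the response to send. The names `makeETag` and `conditionalResponse` are hypothetical, and deriving the ETag from a SHA-1 hash of the body is just one possible choice; the sketch also handles only a single, exact-match If-None-Match value.

```javascript
const crypto = require('crypto');

// Derive a validator from the content. ETag values are quoted strings.
function makeETag(body) {
  return '"' + crypto.createHash('sha1').update(body).digest('hex') + '"';
}

// `requestHeaders` uses lowercase names, as Node's req.headers does.
function conditionalResponse(requestHeaders, body) {
  const etag = makeETag(body);
  const headers = { 'Cache-Control': 'no-cache', ETag: etag };
  if (requestHeaders['if-none-match'] === etag) {
    // The client's copy is still current: send no body.
    return { status: 304, headers, body: '' };
  }
  // First visit, or the content has changed: send everything.
  return { status: 200, headers, body };
}

const first = conditionalResponse({}, 'console.log("v1");');
const second = conditionalResponse({ 'if-none-match': first.headers.ETag }, 'console.log("v1");');
console.log(first.status, second.status);
```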

Pattern 4: Short max-age with must-revalidate

Cache-Control: must-revalidate, max-age=600

This pattern looks good at first, but the trap is that you cannot bust the cache before the 10 minutes are up. This is because must-revalidate revalidates the resource only after max-age expires.

The service worker and the HTTP cache go hand in hand, until we screw it up.

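const version = '2';

// Pre-cache the static assets when the service worker is installed.
self.addEventListener('install', event => {
  event.waitUntil(
    // The cache name is a template literal, so it needs backticks.
    caches.open(`static-${version}`)
      .then(cache => cache.addAll([
        '/',
        '/script-s93bc42a.js',
        '/img-a837cfgh.png',
        '/styles-0e9a7ef0.css'
      ]))
  );
});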

Here we could cache the root page using Pattern 3 and the rest of the resources using Pattern 2. Each service worker update will trigger a request for the root page, but the rest of the resources will be downloaded only if their URLs have changed. Service workers work best as an enhancement rather than a workaround. You can learn more about service workers here.
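Since the cache name includes the version, each update leaves the previous cache behind. The original install handler does not deal with that; one common approach, sketched here as an assumption rather than as part of the pattern, is to delete older `static-` caches in the `activate` event. The helper name `staleCacheNames` is hypothetical.

```javascript
const version = '2';
const currentCache = `static-${version}`;

// Pick out caches created by earlier versions of this naming scheme.
function staleCacheNames(allNames, keep) {
  return allNames.filter(name => name.startsWith('static-') && name !== keep);
}

// Register the listener only where the service worker globals exist.
if (typeof caches !== 'undefined' &&
    typeof self !== 'undefined' &&
    typeof self.addEventListener === 'function') {
  self.addEventListener('activate', event => {
    event.waitUntil(
      caches.keys().then(names =>
        Promise.all(
          staleCacheNames(names, currentCache).map(name => caches.delete(name))
        )
      )
    );
  });
}
```

Keeping the name filter in a separate function makes it easy to check outside a browser, and caches that do not start with `static-` are left untouched.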

