Understanding Memoization in JavaScript to Improve Performance

November 21, 2018 0 Comments

Understanding Memoization in JavaScript to Improve Performance

 

 

Cranking up the performance rate of our apps is what we crave. Memoization is one of the techniques in JavaScript to speed up the lookup of expensive operations by caching the results and re-using the cache in the next operation.

In this article, we will see the usage of memoization and how it could help optimize the performance rate of your apps. Let’s get started.

Tip: When building JS apps, it’s better to share and reuse code than to duplicate it. Using Bit you can easily share and manage components across projects, update changes and collaborate as a team to build faster. Try it :)

If we have a CPU intensive operation, we can optimize the usage by storing the result of the initial operation in the cache. If the operation is bound to be carried out again, we won’t go to the hassle of boring out our CPU again, since the result of the same result was stored somewhere, we just simply return the result.

It could be represented like this:

function longOp(arg) {
if( cache has operation result for arg) {
return the cache
}
else {
perform the long operation, takes 30 minutes
store the result in cache
}
return the result
}
longOp('lp') // performs the operation because it is the first time, it takes 30 minutes
// Next, it stores the result in cache
longOp('bp') // performs the operation because it is the first time for this set of inputs bp, it takes 30 minutes
// Next, it stores the result in cache
longOp('bp') // same operation
// It just returns the cache without perfomrming the long operation, takes 1 second
longOp('lp') // same operation
// It just returns the cache without perfomrming the long operation, takes 1 second

The above pseudo-function longOp is an expensive function in terms of CPU usage. First, it lets the operation run for a specific set of input(s) and caches the result of the operation. On subsequent calls with the same inputs, it just returns the result previously stored in the cache bypassing the time and resource consuming operation.

The first call with args lp, does the time-consuming operation which took 30 minutes to complete. When the function was called with the same input lp, it retrieves the previous result stored in the cache, this took approx. 1 second. The same happened with the input, bp.

Using real examples, let’s say we have a function that finds the square root of a number:

function sqrt(arg) {
return Math.sqrt(arg);
}
log(sqrt(4)) // 2
log(sqrt(9)) // 3

we can memoize this sqrt function:

function sqrt(arg) {
if (!sqrt.cache) {
sqrt.cache = {}
}
if (!sqrt.cache[arg]) {
return sqrt.cache[arg] = Math.sqrt(arg)
}
return sqrt.cache[arg]
}

We see now that the sqrt function save each result in a cache object that is a property of the function.

Note, when memoizing your functions it is ideal to first create your cache and check if the input is already in the cache, if not move on to perform the operation.

We can call the function a few times:

sqrt(9)
sqrt(9)
sqrt(4)

The first call with 9 as param isn't cached, the computation is conducted and the result is cached.

Ths second call returns the cached result.

The third call will perform the computation because the input is its first time. Computation will be performed and the result cached for the next invocation of the function with the same input.

We can add a log to see the objects stored in the cache:

//...
log(sqrt.cache) // { "9": 3, "4": 2 }

Another example, let’s say we have a square function that computes the square of any numbers passed into it.

function square(num) {
return num * num;
}
log(square(2)) // 4
log(square(4)) // 16

We can memoize this function like this:

function square(num) {
if(!square.cache) {
square.cache = {}
}
if(!square.cache[num]){
return square.cache[num] = (num * num)
}
return square.cache[num]
}

Same as sqrt function, it checks if the input has its result already. If yes, it returns the result from the cache. If no, it does the computation, returns the result and stores the result in the cache.

log(square.cache) // undefined, no cache initially
log(square(2)) // 4
log(square.cache) // { "2": 4 }, the result of input 2 now in cache
log(square(2)) // 4
// Simply, returns the result of the input 2 stored in the cache from the above operation.
log(square(4)) // 16, does the operation becuase no operation with input 4 has been carried out before
log(square.cache) // { "2":4, "4": 16}, now input 4 result is now in cache.
log(square(4)) // 16, you can guess it retrievs theresult from the cache

In the last section, we have altered our functions in order to add memoization to them.

Now, we could create a standalone function that will memoize any function. We will call this function memoize.

function memoize(fn) {
return function () {
var args =
Array.prototype.slice.call(arguments)
fn.cache = fn.cache || {};
return fn.cache[args] ? fn.cache[args] : (fn.cache[args] = fn.apply(this,args))
}
}

We see that this function accepts another function as an argument and returns a function.

To use this function, we call the memoize passing the function we want to memoize as an argument.

memoizedFunction = memoize(funtionToMemoize)
memoizedFunction(args)

For example, let’s pass in our first example,

function sqrt(arg) {
return Math.sqrt(arg);
}
const memoizedSqrt = memoize(sqrt)

The returned function memoizedSqrt is now a memoized version of sqrt.

Let’s try it:

//...
memoizedSqrt(4) // 2 calculated
memoizedSqrt(4) // 2 cached
memoizedSqrt(9) // 3 calculated
memoizedSqrt(9) // 3 cached
memoizedSqrt(25) // 5 calculated
memoizedSqrt(25) // 5 cached

We can add the memoize function to the Function prototype so that every function defined in our apps inherits the memoize function and can call it.

Function.prototype.memoize = function() {
var self = this
return function () {
var args =
Array.prototype.slice.call(arguments)
self.cache = self.cache || {};
return self.cache[args] ? self.cache[args] : (self.cache[args] = self(args))
}
}

We know that all functions defined in JS are inherited from the Function.prototype. So, anything thing added to the Function.prototype will be available to all the functions we define.

Let’s see it in action:

function sqrt(arg) {
return Math.sqrt(arg);
}
// ...
const memoizedSqrt = sqrt.memoize()
log(memoizedSqrt(4)) // 2, calculated
log(memoizedSqrt(4)) // 2, returns result from cache
log(memoizedSqrt(9)) // 3, calculated
log(memoizedSqrt(9)) // 3, returns result from cache
log(memoizedSqrt(25)) // 5, calculated
log(memoizedSqrt(25)) // 5, returns result from cache

The goal of memoization is speed. Memoization trades memory space for speed. In an analysis conducted by Ericsson ConsumerLabs, it was found out that waiting for a slow web page to load is comparable to watching a horror movie.

With, memoization we speed up the performance of our apps.

To show the difference in time taken to execute a non-memoized function and a memoized. We will benchmark our functions utilizing tie reading methods provided to us by Node.

I benchmarked the sqrt function:

function sqrt(arg) {
return Math.sqrt(arg);
}
const memoizedSqrt = memoize(sqrt)
console.time("non-memoized call")
console.log(memoizedSqrt(4))
console.timeEnd("non-memoized call")
console.time("memoized call")
console.log(memoizedSqrt(4))
console.timeEnd("memoized call")
$ node memoize
2
non-memoized call: 8.958ms
2
memoized call: 1.298ms

I saw that the non-memoized function sqrt ran slower than the memoized function. We all know the reason, the memoized function doesn’t compute for the same input already cached, it returns the previously stored result.

I ran the tests so many times on my machine and as always the memoized function always beat the non-memoized function:

Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
2
non-memoized call: 8.215ms
2
memoized call: 0.566ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
2
non-memoized call: 7.062ms
2
memoized call: 0.354ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
2
non-memoized call: 7.300ms
2
memoized call: 0.303ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
2
non-memoized call: 7.681ms
2
memoized call: 0.301ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
2
non-memoized call: 7.170ms
2
memoized call: 1.225ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$

Next, I benchmarked the square function:

function square(num) {
return num * num;
}
const memoizedSqr = memoize(square)
console.time("non-memoized call")
console.log(memoizedSqr(2))
console.timeEnd("non-memoized call")
console.time("memoized call")
console.log(memoizedSqr(2))
console.timeEnd("memoized call")

I got the following results:

$ node memoize
4
non-memoized call: 9.455ms
4
memoized call: 4.567ms

As expected the memoized function out-performed the non-memoized function.

Running more tests:

Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
4
non-memoized call: 7.167ms
4
memoized call: 0.294ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
4
non-memoized call: 7.444ms
4
memoized call: 0.562ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
4
non-memoized call: 7.472ms
4
memoized call: 1.487ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
4
non-memoized call: 7.002ms
4
memoized call: 0.790ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
4
non-memoized call: 7.841ms
4
memoized call: 1.788ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$

Here it comes, memoization generally reduces the execution time and impacts the performance rate of our apps. memoization works best when we know that a set of inputs will produce a certain output.

Following best practices, memoization should be implemented on pure functions. Reasons that pure functions produce an output which depends on the input without changing the program's state (side effects).

As memoization trades space for speed, memoization should be used in functions that have a limited input range so as to aid faster checkups.

Memoization works best when dealing with recursive functions, which are used to perform heavy operations like GUI rendering, Sprite and animations physics, etc.

Memoization should not be used when the output of the function isn’t dependent on the input and when the output changes over time.

Let’s see this example:

function unpredicatble(input) {
return input * Date.now()
}
log(unpredicatble(4))
log(unpredicatble(4))
$ node memoize
6169843078712
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
6169843091424

You see, the function was called with the same parameter yet it produced a different result/output. The output changes over time and it isn’t ideal for memoization, as the memoization won’t catch the current output.

Fibonacci is one of many complex algorithms that can be optimized using memoization.

In case, you lost me there. Fibonacci series is that is characterized by the fact that every number after the first two is the sum of the two preceding ones.

1,1,2,3,5,8,13,21,34,55,89

Each number is the sum of the previous two.

F2 = F1 + F1 => (2 = 1+1)
F5 = F3 + F2 => (5 = 3+2)
Therefore,
Fn = Fn-1 + Fn-2

We can now implement this in JS:

function fibonacci(num) {
if (num 1 || num 2) {
return 1
}
return fibonacci(num-1) + fibonacci(num-2)
}

This function is recursive if the num exceeds 2. It recursively calls itself with a decrement.

log(fibonacci(4)) // 3

Let’s benchmark the effectiveness of running the fibonacci against a memoized version.

const memFib = memoize(fibonacci)
log('profiling tests for fibonacci')
console.time("non-memoized call")
log(memFib(6))
console.timeEnd("non-memoized call")
console.time("memoized call")
log(memFib(6))
console.timeEnd("memoized call")
$ node memoize
profiling tests for fibonacci
8
non-memoized call: 2.152ms
8
memoized call: 1.270ms

The first call took 2.152ms, then next call hit the cache which took 1.270ms. That’s a difference of 0.882ms!!! So much time.

Running more tests:

$ node memoize
profiling tests for fibonacci
8
non-memoized call: 3.031ms
8
memoized call: 1.136ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
profiling tests for fibonacci
8
non-memoized call: 1.444ms
8
memoized call: 0.249ms
Admin@PHILIPSZDAVIDO MINGW64 /c/wamp/www/developerse/projects/trash
$ node memoize
profiling tests for fibonacci
8
non-memoized call: 1.200ms
8
memoized call: 0.242ms

You see at some point it took the memoized version almost 0ms to execute.

The factorial of a number is gotten by multiplying the number in sequence from the number down to 1.

5! = 5 * 4 * 3 * 2 * 1
such that,
N! = N * (N-1) * (N-2) * (N-3) * ... * 2 * 1

It can be implemented in JS like this:

function factorial(num) {
if(num 1 || num 0) {
return 1
}
return num * factorial(num-1)
}

Factorial is a recursive-bound algorithm and it can become expensive to call multiple times.

Let’s say we want to calculate the factorial of 20.

log(factorial(20)) // 2432902008176640000

That’s huge!! And it took:

console.time("factorial(20) call")
log(factorial(20))
console.timeEnd("factorial(20) call")
factorial(20) call: 1.932ms

Factorial is repeating a number of steps.

factorial(20)

is the same as

factorial(20) = 20 * factorial(19)
20! = 20.19!
N! = N.(N-1)!

If we could cache the results of the previous calculations, we would speed up the execution time of our function. Instead of finding factorial(19) every time, it returns the result of the previous calculation from the cache.

We memoize the factorial function, then benchmark it to see the performance:

const memFact = memoize(factorial)
log('profiling tests for factorial')
console.time("non-memoized factorial(20) call")
log(memFact(20))
console.timeEnd("non-memoized factorial(20) call")
console.time("memoized factorial(20) call")
log(memFact(20))
console.timeEnd("memoized factorial(20) call")
$ node memoize
profiling tests for factorial
2432902008176640000
non-memoized factorial(20) call: 2.956ms
2432902008176640000
memoized factorial(20) call: 2.084ms

The memoized factorial fares better.

In this post, we learned what memoization does and how it could affect the performance of our web apps.

Next, we went on to create several functions and a memoized version of them. Then, we benchmarked them and saw that the memoized version outperformed the non-memoized version. Finally, we explored several use cases to show how memoization could have a huge impact on them.

Be it as it may, memoization has a great benefit but it comes with a cost. It trades heavy memory usage for speed. It might go un-noticed in low-memory functions but would have an impact when dealing with RAM-intensive functions.

If you have any question regarding this or anything I should add, correct or remove, feel free to comment, email or DM me.


Tag cloud