August 16, 2016

Array, Promise and Maybe monads. Plus Docker is a functor.

Big thank you to Vincent Orr and especially to Luis Atencio for reading an early draft of this post and providing generous feedback.

There is a new excellent explanation of the Maybe monad written by James Sinclair. I highly recommend reading his long and comprehensive essay. I just want to give a slightly different reason for having monads around.

Simple function for primitives

Whenever I code, I prefer to write the simplest functionality first. For example, an addition function would just return the sum of its two arguments.
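A minimal sketch of such a function; the name add is the one used in the rest of this post.

```javascript
// adds two numbers and does nothing else
function add(a, b) {
  return a + b
}

console.log(add(2, 3)) // 5
```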

What if I want to log the input arguments before adding them? I do not add the logging into the function add - this adds a second concern to the function, making it harder to understand, test and reuse.
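Instead, the logging can live in a separate wrapper. The sketch below assumes a helper called logArguments (the name the text uses); its exact shape, and the name loggedAdd, are assumptions made for illustration.

```javascript
function add(a, b) {
  return a + b
}

// wraps any function: logs the arguments, then calls the original
function logArguments(fn) {
  return function (...args) {
    console.log('arguments', args)
    return fn(...args)
  }
}

const loggedAdd = logArguments(add)
console.log(loggedAdd(2, 3)) // logs the arguments first, then prints 5
```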

We can refactor the logArguments further to separate the log call from applying the given function, but the main point is this:

We kept the function add as is; add has a very clear purpose and works on its simple arguments.

Let us take a look at another simple function, for example double
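A minimal sketch of double:

```javascript
// doubles a single number
function double(x) {
  return x * 2
}

console.log(double(5)) // 10
```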

Again, double only handles the simplest case possible - a single input argument. If we wanted to extend double with additional features, like logging its arguments, we would create a second function via composition.

Working with arrays

What if we wanted to double many numbers? Given a list of numbers, could we double every element in the list by reusing the original double that operates on a single item at a time?

map
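One possible hand-written map, as a sketch. The argument order (function first, list second) is an assumption.

```javascript
function double(x) {
  return x * 2
}

// adapts a single-value function to work on every item of a list
function map(fn, list) {
  const result = []
  for (const item of list) {
    result.push(fn(item))
  }
  return result
}

console.log(map(double, [1, 2, 3])) // [ 2, 4, 6 ]
```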

The function double stays the same, but it was adapted to work on a list of numbers using another function we wrote. What if we wanted to log each number before doubling it? Just compose double, logArguments and map functions!
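A sketch of that composition, assuming hand-written logArguments and map helpers shaped as below (the name loggedDoubles is made up for this example):

```javascript
function double(x) {
  return x * 2
}

// logs the arguments, then calls the wrapped function
function logArguments(fn) {
  return function (...args) {
    console.log('arguments', args)
    return fn(...args)
  }
}

// applies fn to every item of the list
function map(fn, list) {
  const result = []
  for (const item of list) {
    result.push(fn(item))
  }
  return result
}

// each number is logged before it is doubled
const loggedDoubles = map(logArguments(double), [1, 2, 3])
console.log(loggedDoubles) // [ 2, 4, 6 ]
```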

What is so interesting about the map function we have just written? Its purpose is not really to adapt the double function per se. It does not extend its functionality the way logArguments does. No, the map function adapts the function double to work with a different type of inputs - it "teaches" double how to work with multiple numbers stored in an Array.

In functional jargon we say that map "lifts" the function double to work on a non-primitive type.

We can write "type signature" next to each function for clarity. Even if the original function double works with any argument due to JavaScript's dynamic nature, let us pretend that we are only passing numbers to it.
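For example, the signatures could be written as comments. The arrow notation below is borrowed from Haskell-style signatures and is only one possible convention, not something JavaScript checks.

```javascript
// double :: number -> number
function double(x) {
  return x * 2
}

// map :: (a -> b, Array<a>) -> Array<b>
function map(fn, list) {
  const result = []
  for (const item of list) {
    result.push(fn(item))
  }
  return result
}

console.log(map(double, [1, 2])) // [ 2, 4 ]
```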

Function map adapts any function that works on a simple single argument to also work on an entire list of values. It is so useful and common that it became a built-in method in the ES5 JavaScript standard, and it belongs to the Array type. Thus our program should really be written simply as
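With the built-in Array.prototype.map, the program could look like this (the variable name doubled is an assumption):

```javascript
function double(x) {
  return x * 2
}

const doubled = [1, 2, 3].map(double)
console.log(doubled) // [ 2, 4, 6 ]
```

The built-in map also passes the index and the whole array to the callback; double simply ignores these extra arguments.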

The Array is so convenient and is used so often that sometimes we run into trouble. For example, what if the function we are mapping does not return a primitive value, but (drum roll please) an already wrapped value?

Consider for example a function that returns all distinct characters in a given string.
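A minimal sketch of such a function. Returning an Array of single-character strings and using an ES6 Set to remove duplicates are assumptions; other implementations are possible.

```javascript
// returns the distinct characters of a string, in order of first appearance
function distinct(word) {
  return Array.from(new Set(word))
}

console.log(distinct('hello')) // [ 'h', 'e', 'l', 'o' ]
```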

What happens when we want to find distinct characters across a list of words? Well, we are going to map over an Array, right?
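A sketch of that attempt, assuming distinct returns an Array of characters. The sample words are made up.

```javascript
// assumed implementation: distinct characters as an Array
function distinct(word) {
  return Array.from(new Set(word))
}

const result = ['hello', 'moon'].map(distinct)
console.log(result)
// [ [ 'h', 'e', 'l', 'o' ], [ 'm', 'o', 'n' ] ]
```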

Hmm, this does not work very well, because the function distinct is a little more complicated than double. Their signatures show this
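Written as comments next to minimal versions of the two functions (distinct returning an Array is an assumption of this sketch):

```javascript
// double :: number -> number
// primitive in, primitive out
function double(x) {
  return x * 2
}

// distinct :: string -> Array<string>
// primitive in, already wrapped value out
function distinct(word) {
  return Array.from(new Set(word))
}

console.log(typeof double(2))             // number
console.log(Array.isArray(distinct('aa'))) // true
```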

When we map over distinct we put back into the Array more little Arrays instead of primitive types (that should be string in this case). The assumption that map makes is that the function it adapts takes the primitive value and returns a primitive unwrapped value.

We could of course "wrap" the value returned by the join ourselves, then we could call map if we wanted.

flatMap

Almost correct, we really want to handle the case when the mapped function returns an already wrapped value! Let us write a function that is just like the map function, but knows how NOT to doubly wrap the result. Because in this case its purpose is to avoid nested arrays, we need to flatten the result, and thus we will call it flatMap - it maps the value and then flattens the returned wrapped result to avoid wrapping it twice.
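A standalone sketch of flatMap. The argument order and the use of concat for flattening are assumptions.

```javascript
// like map, but fn returns an Array whose items are appended
// to the result instead of being nested inside it
function flatMap(fn, list) {
  let result = []
  for (const item of list) {
    result = result.concat(fn(item))
  }
  return result
}

console.log(flatMap(x => [x, x], [1, 2])) // [ 1, 1, 2, 2 ]
```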

The above function calls the simple fn that MUST return an Array. The Array is the wrapped value in this case, and we prevent putting an Array into an Array by replacing the internal list with the new result.

Here is how we use it
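A usage sketch with assumed standalone flatMap and distinct helpers; the sample words are made up.

```javascript
function distinct(word) {
  return Array.from(new Set(word))
}

function flatMap(fn, list) {
  let result = []
  for (const item of list) {
    result = result.concat(fn(item)) // append items, do not nest
  }
  return result
}

const letters = flatMap(distinct, ['hello', 'moon'])
console.log(letters)
// [ 'h', 'e', 'l', 'o', 'm', 'o', 'n' ]
```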

It is almost like re-gifting a present. You do not just place the wrapped unwanted present into a new gift bag and give it to someone else. You first unwrap and discard the wrapping paper with the personal greeting card, wrap the present in your paper and add a new greeting card.

Combining map with flatMap

We can chain both map and flatMap method calls, depending on the return type of the lifted (adapted) function. For example we can flat map strings in an Array to letters and then map each letter to upper case.
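A sketch of such a chain. It relies on the built-in Array.prototype.flatMap, which became part of the language in ES2019, as a stand-in for a hand-written flatMap method.

```javascript
function distinct(word) {
  return Array.from(new Set(word))
}

const upperCased = ['hello', 'moon']
  .flatMap(distinct)               // distinct returns an Array: flatten it
  .map(ch => ch.toUpperCase())     // toUpperCase returns a primitive: plain map

console.log(upperCased)
// [ 'H', 'E', 'L', 'O', 'M', 'O', 'N' ]
```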

The map vs flatMap choice should naturally depend on the function used. For example we could use flatMap to double numbers, but it would be a lot more efficient and intuitive to use simple map
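To illustrate, both calls below produce the same list, but the flatMap version has to wrap every result in a throwaway Array first. The sketch uses the built-in ES2019 flatMap; the variable names are assumptions.

```javascript
const double = x => x * 2

// works, but allocates a one-item Array per number just to flatten it again
const viaFlatMap = [1, 2, 3].flatMap(x => [double(x)])

// simpler: double returns a primitive, so map is the natural fit
const viaMap = [1, 2, 3].map(double)

console.log(viaFlatMap) // [ 2, 4, 6 ]
console.log(viaMap)     // [ 2, 4, 6 ]
```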

Working with async code

The above examples are all synchronous. The result is available right away, that is why we could print the returned value. The Arrays are very suitable for working with synchronous lifted functions; but they break down when the items are generated asynchronously, like reading values from a database or network resource.

Promises are abstractions that allow us to work efficiently with values that will be generated asynchronously. They are part of the ES6 standard and are widely available in browsers.

Promises are monads! They wrap a value that will be there in the future, and allow modifying the value using a "map" operation, called then. In fact Promises are even more user friendly than the above Array monad because both map and flatMap are just a single method then. The Promise.prototype.then is "smart" enough to determine how it should treat the value returned from the callback function.

If the callback function returns a non-Promise instance, it will be treated like map, placing the value back into the Promise "wrapper".
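A small sketch of then acting like map; the starting value 21 is arbitrary.

```javascript
const double = x => x * 2

// double returns a plain number, so then puts it back into a Promise
const doubled = Promise.resolve(21).then(double)

doubled.then(value => console.log(value)) // 42
```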

If the passed function returns a Promise object, it knows NOT to double wrap it and then method acts like flatMap instead
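A sketch of this case. The starting value 42 and the variable name printed are assumptions; the inner value 100 matches the description that follows.

```javascript
const printed = Promise.resolve(42)
  .then(function () {
    // returns a Promise, not a primitive
    return Promise.resolve(100)
  })
  .then(function (value) {
    console.log(value) // 100
    return value
  })
```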

The anonymous function in the middle has returned a Promise object (with value 100 inside), but what got printed next was just the primitive value 100, not [Promise{100}] because then acted like flatMap in this case.

The method flatMap (and then for Promises) is nice, but it only flattens the returned value of the same type. For example, if we resolve a Promise with an Array, it will keep the Array, since Promise only understands how to flatten other Promises, not other monad container types.
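For example, in this sketch the Array survives intact inside the Promise (the values are arbitrary):

```javascript
// the callback returns an Array, which then does not know how to flatten
const wrapped = Promise.resolve(1).then(() => [2, 3])

wrapped.then(value => {
  console.log(Array.isArray(value)) // true
  console.log(value)                // [ 2, 3 ]
})
```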

Similarly, we can map or flatMap an Array of Promises, but we still will get the Promise objects inside.
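A sketch of both attempts, using the built-in ES2019 Array.prototype.flatMap; the helper name toPromise is made up.

```javascript
const toPromise = x => Promise.resolve(x)

const mapped = [1, 2].map(toPromise)
const flatMapped = [1, 2].flatMap(toPromise) // Array flatMap only flattens Arrays

console.log(mapped[0] instanceof Promise)     // true
console.log(flatMapped[0] instanceof Promise) // true
```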

If we really want to get the primitive value from a monad container inside another monad container, we have to map with custom code. For example, the Promise API has a method that converts an Array of Promises into a single Promise holding an Array of resolved primitive values.
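A sketch with three already resolved Promises; the values and the name all are arbitrary.

```javascript
const all = Promise.all([Promise.resolve(1), Promise.resolve(2), Promise.resolve(3)])

all.then(values => console.log(values)) // [ 1, 2, 3 ]
```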

The call Promise.all in the first line takes as an argument an Array monad with each individual item being a Promise monad. It returns a single Promise monad that contains an Array monad. This switcheroo gives us the actual values.

Note: this Maybe monad is slightly different from the typical implementation. It is only supposed to be an example how to safely perform numerical division.

The Maybe monad is useful for dealing with non-existent values without a pyramid of if - else blocks. Take a look at the typical divide function.
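A minimal sketch of divide, with no guard at all:

```javascript
function divide(a, b) {
  return a / b
}

console.log(divide(10, 2)) // 5
console.log(divide(10, 0)) // Infinity
```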

Without guarding against a zero second argument, we can get quite a large number: in JavaScript, dividing a positive number by zero yields Infinity! We could add the guard logic into the function itself, but this goes against our principle - add new features to the existing function by composing functions, not by putting more code inside of them.
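For comparison, this is roughly what the guard inside the function would look like (a sketch of the approach we want to avoid):

```javascript
function divide(a, b) {
  if (b === 0) {
    return 'nope' // hard-coded reaction to the invalid input
  }
  return a / b
}

console.log(divide(10, 2)) // 5
console.log(divide(10, 0)) // nope
```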

Notice we had to hard code the action to take if the second argument was zero, in this case returning 'nope' string (or maybe throwing an exception).

We can handle the above situation differently. Let us wrap the data in a new data type for storing just two numbers
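A sketch of such a type. The property names a and b are assumed from the way the result is read later in the text.

```javascript
// a container for just two numbers: the dividend a and the divisor b
function Maybe(a, b) {
  this.a = a
  this.b = b
}

const pair = new Maybe(10, 2)
console.log(pair.a, pair.b) // 10 2
```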

Using keyword new is a pain so we can just add a utility method of to make creating a Maybe object of two numbers easier
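A sketch, assuming Maybe is a plain constructor function with properties a and b:

```javascript
function Maybe(a, b) {
  this.a = a
  this.b = b
}

// utility so callers do not have to write "new"
Maybe.of = function (a, b) {
  return new Maybe(a, b)
}

const pair = Maybe.of(10, 2)
console.log(pair instanceof Maybe) // true
```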

We have an existing function divide(a, b) that we want to reuse safely.

The function divide operates on the primitive values and returns a primitive, thus it is a good candidate for map method. We will add map method to the Maybe object. We will place the guard logic there!
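One way this map could look. This is a sketch under assumptions: the result is stored in the a property, the b property of the result is left unset, and an invalid division is marked by setting a to null.

```javascript
function Maybe(a, b) {
  this.a = a
  this.b = b
}

Maybe.of = function (a, b) {
  return new Maybe(a, b)
}

// applies fn to the two numbers, unless the division would be invalid
Maybe.prototype.map = function (fn) {
  if (this.a === null || this.b === 0) {
    return Maybe.of(null, null) // nothing to compute
  }
  return Maybe.of(fn(this.a, this.b)) // the result goes into a
}
```

The guard lives entirely in map; the function passed in never sees a zero divisor.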

We can then safely use the division - the result, if there is one, will be in the a property. If a is null, then the division was invalid because the second number was zero.
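A usage sketch, assuming a Maybe whose map guards against a zero divisor and stores the result in a (null when the division is invalid):

```javascript
function Maybe(a, b) {
  this.a = a
  this.b = b
}
Maybe.of = (a, b) => new Maybe(a, b)
Maybe.prototype.map = function (fn) {
  if (this.a === null || this.b === 0) {
    return Maybe.of(null, null) // invalid division
  }
  return Maybe.of(fn(this.a, this.b))
}

// the original simple function, unchanged
function divide(a, b) {
  return a / b
}

console.log(Maybe.of(10, 2).map(divide).a) // 5
console.log(Maybe.of(10, 0).map(divide).a) // null
```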

What about flatMap? We need to handle a case when the function returns an instance of Maybe.

We can use it on a somewhat artificial example
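A sketch of flatMap together with one such example. Everything here is an assumption made for illustration, including the helper divideThenPair, which divides two numbers and returns a new Maybe pairing the result with a divisor of 2.

```javascript
function Maybe(a, b) {
  this.a = a
  this.b = b
}
Maybe.of = (a, b) => new Maybe(a, b)
Maybe.prototype.map = function (fn) {
  if (this.a === null || this.b === 0) return Maybe.of(null, null)
  return Maybe.of(fn(this.a, this.b)) // wraps the primitive result
}
Maybe.prototype.flatMap = function (fn) {
  if (this.a === null || this.b === 0) return Maybe.of(null, null)
  return fn(this.a, this.b) // fn already returns a Maybe: do not wrap again
}

const divide = (a, b) => a / b
const divideThenPair = (a, b) => Maybe.of(a / b, 2)

const result = Maybe.of(100, 5).flatMap(divideThenPair).map(divide)
console.log(result.a) // 10

// using map with the same function wraps a Maybe inside a Maybe
const nested = Maybe.of(100, 5).map(divideThenPair)
console.log(nested.a instanceof Maybe) // true
```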

Notice that flatMap cannot be interchanged with map - if the given function returns an already wrapped value, we need to use flatMap to avoid double wrapping. If we forget this rule and use map, the result will be a Maybe nested inside another one.

Getting the result out of the above Maybe monad is kind of awkward. Thus most Maybe implementations provide convenience methods. These methods also help avoid hard coding the logic in the original simple function. Let us add a method that gets the computed value or returns the default value if the division was invalid
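A sketch of such a method. The name getOrElse is an assumption here, as is the convention that an invalid division leaves null in the a property.

```javascript
function Maybe(a, b) {
  this.a = a
  this.b = b
}
Maybe.of = (a, b) => new Maybe(a, b)
Maybe.prototype.map = function (fn) {
  if (this.a === null || this.b === 0) return Maybe.of(null, null)
  return Maybe.of(fn(this.a, this.b))
}

// returns the computed value, or the default if there is none
Maybe.prototype.getOrElse = function (defaultValue) {
  return this.a === null ? defaultValue : this.a
}

const divide = (a, b) => a / b

console.log(Maybe.of(10, 2).map(divide).getOrElse('nope')) // 5
console.log(Maybe.of(10, 0).map(divide).getOrElse('nope')) // nope
```

The caller now picks the fallback value; divide itself stays untouched.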

Libraries like Ramda Fantasy and Folktale provide good implementations of Maybe and other monads with lots of convenience methods.

Docker is a functor

If monads wrap a value and have both map and flatMap methods, then what do we call types that only have the map method? They are called functors; a standard ES5 Array is an example. A functor only "safely" applies the given callback function, giving it the unwrapped value and placing the primitive result back into the container without any thinking.

Sometimes a very unusual structure turns out to be a functor. Docker images are built from simple text files. Each Dockerfile specifies a base image and then each command (like install a software module) creates a new derived image. Here is a Dockerfile that installs gulp in a Node 6 base image

We can take this file and run docker build command.

The above build step downloaded the image "mhart/alpine-node:6", which contains the NodeJS binary, ran the command npm install -g gulp inside that "environment" and created a new image with id "f3158cbb038c".

Conceptually, this looks a lot like our monads! The Docker image wraps a value (in the example a NodeJS binary, then a NodeJS binary and "gulp" installed). Each statement RUN <command> is like a JavaScript function. The command docker build plays the map method role - it allows a simple "dumb" command npm install -g gulp to actually run on the value inside the image and then places the value back into the Docker image.

(I am avoiding using the word "container" because Docker actually has a specific meaning for this term - a Docker container is an image being executed).

So the above program is almost like the following JavaScript
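The sketch below is a toy model, not a real Docker API: the Image class, its contents list and the function npmInstallGulp are all made up to mirror the build step described above.

```javascript
// hypothetical model: an image is just a list of what is installed in it
class Image {
  constructor(contents) {
    this.contents = contents
  }
  static of(contents) {
    return new Image(contents)
  }
  // plays the role of "docker build": runs the command on the contents
  // and wraps the result in a new image
  map(command) {
    return Image.of(command(this.contents))
  }
}

// stands in for "RUN npm install -g gulp"
const npmInstallGulp = contents => contents.concat('gulp')

const nodeImage = Image.of(['node 6'])
const gulpImage = nodeImage.map(npmInstallGulp)
console.log(gulpImage.contents) // [ 'node 6', 'gulp' ]
```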

We can repeat the process multiple times. Another Dockerfile can use the produced image and run ("map") a simple command that will operate inside the image.

which is equivalent (combined with the previous step) to:
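Again as a toy model rather than a real Docker API: Image, its contents list and both command functions are hypothetical, and the second command is only a placeholder for whatever the second Dockerfile runs.

```javascript
// hypothetical model: an image is just a list of what is installed in it
class Image {
  constructor(contents) {
    this.contents = contents
  }
  static of(contents) {
    return new Image(contents)
  }
  map(command) {
    return Image.of(command(this.contents))
  }
}

const npmInstallGulp = contents => contents.concat('gulp')
// placeholder for the command in the second Dockerfile
const secondCommand = contents => contents.concat('second command result')

const finalImage = Image.of(['node 6'])
  .map(npmInstallGulp)  // first Dockerfile
  .map(secondCommand)   // second Dockerfile, built on the produced image

console.log(finalImage.contents)
// [ 'node 6', 'gulp', 'second command result' ]
```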

Why Docker is not a monad

In order for Docker to be a "monad" it would have to support flatMap. A callback function for flatMap could return a new Docker image and the original container would have to know how to switch to it. That is, the returned value would be a Docker image that would become the new base image.

Since there is no Docker functionality like this (there is Docker in Docker, but it does not support replacing the original image with the image returned by the build command), we can only express what it would be like in fake JavaScript notation
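In a toy model this could read as follows. Nothing here corresponds to real Docker functionality: Image, flatMap and buildOtherImage are invented purely to show the shape of the missing operation.

```javascript
// hypothetical model: an image is just a list of what is installed in it
class Image {
  constructor(contents) {
    this.contents = contents
  }
  static of(contents) {
    return new Image(contents)
  }
  map(command) {
    return Image.of(command(this.contents))
  }
  // imaginary: Docker has no equivalent of this method
  flatMap(command) {
    return command(this.contents) // the returned image replaces this one
  }
}

// an imaginary command that returns a whole new image
const buildOtherImage = contents => Image.of(['some other base'].concat(contents))

const switched = Image.of(['node 6']).flatMap(buildOtherImage)
console.log(switched.contents) // [ 'some other base', 'node 6' ]
```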

Thus the Docker system is not a full monad, but only a functor - it only knows how to map shell commands.

To read more about functors, I have a couple of other blog posts:

I also have Docker basics covered in a single gist.

Conclusion

Monads are a way to reuse simple functions. Monads wrap a primitive value that the simple function expects. The wrapping logic could be used to iterate over items in a list, handle asynchronous values or guard against values that would break the simple functions.

Docker is a functor - it only supports "map" method, but not the "flatMap" method.