Goodbye Transform-Streams, long live ES9 Async Generators

December 07, 2018

Clickbait title. Async generators are a powerful new feature, but you cannot throw transform streams in the trash.

The Node.js documentation says: “a transform stream is a duplex stream where the output is computed in some way from the input. Examples include zlib streams or crypto streams that compress, encrypt, or decrypt data”.

I’m not going to explain them here; you can find plenty of free resources online. They are a really powerful tool for building amazing things with Node.js, but the drawback is that they are not simple to implement, not even for easy tasks. Still, if all you need is to transform chunks, odds are that this type of stream is the best choice.

Transform streams are independent units in our code: they can be piped directly into other streams, with all the advantages that brings. Backpressure is applied automatically, so we do not have to worry about the readable stream and the writable one working at different speeds. Node.js will optimize these processes, making them concurrent.
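To make this concrete, here is a minimal sketch of a transform stream that upper-cases text chunks and is piped between a source and process.stdout. The `createUpperCase` name and the `Readable.from()` source are assumptions made for illustration only.

```javascript
const { Readable, Transform } = require('stream');

// Hypothetical example transform: upper-cases every text chunk it receives.
function createUpperCase() {
  return new Transform({
    transform(chunk, encoding, callback) {
      // First argument is an error (or null), second is the transformed chunk.
      callback(null, chunk.toString().toUpperCase());
    },
  });
}

// Readable.from() stands in for a real source; pipe() takes care of backpressure.
Readable.from(['hello ', 'transform ', 'streams\n'])
  .pipe(createUpperCase())
  .pipe(process.stdout);
```

Even for a one-line transformation you need the constructor, the callback and the error-first convention, which is the boilerplate mentioned above.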

Async generators are a new ES9 (ES2018) feature that lets us easily implement the asynchronous iterator interface. Node.js readable streams already implement it, allowing us to do something like the following:
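A minimal sketch of the idea, assuming a stand-in source built with `Readable.from()` (any readable stream, for instance one returned by `fs.createReadStream()`, should behave the same way). The `collect` helper is a hypothetical name used only here.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: gathers every chunk of an async iterable into an array.
async function collect(readable) {
  const chunks = [];
  // for-await-of pulls chunks through the stream's async iterator.
  for await (const chunk of readable) {
    chunks.push(chunk);
  }
  return chunks;
}

// Stand-in source for the example.
collect(Readable.from(['a', 'b', 'c'])).then((chunks) => console.log(chunks));
```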

So if a stream is readable it is also an asynchronous iterable.

This is not a tutorial on async stuff; if you want to know more about async iterables and async generators, check this out!

They can be used to transform chunks like transform streams, but their main purpose is to handle the data flow.

In short, async generators let you:

Now time is completely under your control. You do not have to query data from the source as soon as it is available. You do not have to write data into a write stream as soon as you receive it from the source. Of course, you must wait each time you query data and each time you write it, because reads and writes are asynchronous operations.

The point is that what you do between a read and a write is up to you, and it can be asynchronous. It would be a lie if I told you that you cannot do something similar with transform streams. But with async generators things are easier, and you can let the asynchronous data flow be managed by other entities in your code. This is why we can consider async generators a flow control tool.
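One way to picture this flow control is a consumer that drives the iteration by hand with `next()`. This is only a sketch: `passThrough`, `consumeSlowly` and `sleep` are hypothetical names, and the delay stands in for whatever asynchronous work you need between two reads.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// A generator that forwards chunks unchanged.
async function* passThrough(source) {
  for await (const chunk of source) {
    yield chunk; // suspended here until the consumer asks for the next value
  }
}

// The consumer, not the generator, decides when the next chunk is requested.
async function consumeSlowly(iterable, delayMs) {
  const iterator = iterable[Symbol.asyncIterator]();
  const seen = [];
  let step = await iterator.next();
  while (!step.done) {
    seen.push(step.value);
    await sleep(delayMs); // any asynchronous work can happen here
    step = await iterator.next();
  }
  return seen;
}

consumeSlowly(passThrough(Readable.from(['a', 'b', 'c'])), 10)
  .then((seen) => console.log(seen));
```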

Let’s see an example where we pipe two generators to filter out all even numbers and increment the odd ones by one, before writing them to process.stdout. The numbers arrive asynchronously from a mysterious readable stream:
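A minimal sketch of such a pipeline follows. The generator names `filterEven` and `increment` are assumptions, and `Readable.from()` plays the role of the readable stream of numbers.

```javascript
const { Readable } = require('stream');

// Drops even numbers and passes odd ones through.
async function* filterEven(source) {
  for await (const n of source) {
    if (n % 2 !== 0) yield n;
  }
}

// Adds one to every number it receives.
async function* increment(source) {
  for await (const n of source) {
    yield n + 1;
  }
}

// Stand-in for the readable stream of numbers.
const numbers = Readable.from([1, 2, 3, 4, 5]);

// Immediately invoked async function expression that consumes the piped generators.
(async () => {
  for await (const n of increment(filterEven(numbers))) {
    process.stdout.write(`${n}\n`);
  }
})();
```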

In this example there is no explicit flow control, because each async generator simply consumes all chunks as soon as possible. Inside the IIAFE (immediately invoked async function expression) things don’t change, due to the third for-await-of. But the async iterable returned by piping the async generators could have been used differently, giving us the opportunity to let other entities manage the whole asynchronous iteration.

Don’t worry. I’m working on a little module that lets you write functions, async ones included, and then transform them into async generators! Furthermore, I’ve coded a little helper function to easily pipe async generators together.

Let’s see how to use the asyncGeneratorsFactory:

You can see that filter functions must return undefined only when a chunk has to be discarded. Normal transform functions must always return a value. Composition functions are not yet supported, but I’m working on them.
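The convention described above (return undefined to discard a chunk, return a value otherwise) can be sketched with a small wrapper. This is not the real asyncGeneratorsFactory, whose signature is not shown here: `toAsyncGenerator` is a hypothetical stand-in that only illustrates the idea.

```javascript
// Hypothetical sketch, NOT the module's actual API: wraps a sync or async
// function into an async generator function.
function toAsyncGenerator(fn) {
  return async function* (source) {
    for await (const chunk of source) {
      const result = await fn(chunk); // works for plain and async functions
      // undefined means "discard this chunk" (the filter convention).
      if (result !== undefined) yield result;
    }
  };
}

// A filter: returns undefined for the chunks to drop.
const filterEven = toAsyncGenerator((n) => (n % 2 !== 0 ? n : undefined));

// A normal transform: always returns a value.
const increment = toAsyncGenerator(async (n) => n + 1);

(async () => {
  // for-await-of also accepts a plain array as the source.
  for await (const n of increment(filterEven([1, 2, 3, 4, 5]))) {
    console.log(n);
  }
})();
```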

Now let’s see how to pipe the generators:
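A piping helper of this kind can be as small as a reduce over the generators. The sketch below uses a hypothetical `pipeGenerators` function, which is an assumption and not necessarily how the real helper is written.

```javascript
// Hypothetical helper: feeds `source` through each async generator in order.
function pipeGenerators(source, ...generators) {
  return generators.reduce((iterable, generator) => generator(iterable), source);
}

// Drops even numbers and passes odd ones through.
async function* filterEven(source) {
  for await (const n of source) {
    if (n % 2 !== 0) yield n;
  }
}

// Adds one to every number it receives.
async function* increment(source) {
  for await (const n of source) {
    yield n + 1;
  }
}

(async () => {
  // Equivalent to increment(filterEven(source)), but reads left to right.
  const piped = pipeGenerators([1, 2, 3, 4, 5], filterEven, increment);
  for await (const n of piped) {
    process.stdout.write(`${n}\n`);
  }
})();
```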

It’s great, isn’t it?

Here you can find a complete gist.

Here you can see the source code of my helpers.

I’ve tested async generators against transform streams using files. With small and medium-sized files you cannot see much difference: sometimes transform streams are better, sometimes async generators are better.

If you need to use my asyncGeneratorsFactory helper function, know that things will go a bit slower.

Anyway, with big files it seems that async generators are better even for pure transformation purposes. But more tests are needed, also with different types of streams. If you can, help me :)
