Developers often want to split a computation into several separate stages. The smaller a stage is, the easier it is to reason about, develop, and maintain. For example, we can split some computation into three stages implemented by functions f, g, and k, with the resulting computation being
input => f(g(k(input))) or, using Ramda,
R.compose(f, g, k), or any other library with a function composition operator.
The problem with this approach is the intermediate data passed between the functions: each stage must finish its computation completely before passing the result to the next one. The data the stages operate on may be large, or even infinite if it is, say, a stream of server requests. In the case of unlimited data,
k will never return control. Since this is a frequently occurring task, there are many solutions, such as Node.js streams with their
.pipe() operation adding a stream transformer to the chain.
The worst approach would be to pass a single value between the stages and mutate it. Shared data mutation is very difficult to reason about, especially if the data is some recursive structure, such as a programming language's abstract syntax tree.
The transducers described in this post may be seen as a simpler solution to the problem: the stages work simultaneously, with no intermediate data and no data mutation.
Transducers are easy to compose. In fact they are just functions, and function composition is enough: the expressions above (
input => f(g(k(input))) and
R.compose(f, g, k)) work the same for transducers. The resulting transducer is a pipeline of computations receiving data from a producer and passing it to a consumer. Producers and consumers may do many things: read or write network data, files, a DB, or just an in-memory array.
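As a minimal sketch of such composition (the two generator-based transducers `double` and `positive` are illustrative names, not from the original):

```javascript
// Two tiny transducers: each takes an iterator and returns a new one.
function* double(input) {
  for (const value of input) yield value * 2
}

function* positive(input) {
  for (const value of input) {
    if (value > 0) yield value
  }
}

// Composing transducers is plain function composition.
const pipeline = input => double(positive(input))

console.log([...pipeline([-1, 1, 2])]) // [2, 4]
```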
The Clojure transducer type from the original blog post is:
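As best I recall from Rich Hickey's announcement post (notation as used there), a reducing function and a transducer look like this:

```
;; a reducing function:
whatever, input -> whatever

;; a transducer transforms one reducing function into another:
(whatever, input -> whatever) -> (whatever, input -> whatever)
```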
There is an earlier paper with an example of transducers transforming producers instead of consumers: “Lazy v. Yield: Incremental, Linear Pretty-printing in Haskell”. The data types there are:
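As best I recall, the Haskell types from that paper are (reproduced from memory; see the paper for the authoritative definitions):

```haskell
type GenT e m   = ReaderT (e -> m ()) m
type Producer m e = GenT e m ()
type Consumer m e = e -> m ()
type Transducer m1 m2 e1 e2 = Producer m1 e1 -> Producer m2 e2
```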
To see the Clojure reducer inside Consumer, substitute the State monad,
State s a = s -> (a, s), into the Consumer definition:
Consumer (State whatever) input
= input -> State whatever ()
= input -> whatever -> ((), whatever)
≅ whatever, input -> whatever
The last step uncurries the function and drops the unit result, giving exactly the reducing-function type from the Clojure post.
The Array.from function stores the result in an in-memory Array. The approach itself works even if the sequence is infinite, since generators are lazy; only a consumer that reads everything, like Array.from, needs the sequence to end.
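For example (a sketch; the `numbers` producer is illustrative):

```javascript
// A generator is a producer; Array.from is a consumer pulling the whole
// sequence eagerly into an in-memory array.
function* numbers() {
  yield 1
  yield 2
  yield 3
}

console.log(Array.from(numbers())) // [1, 2, 3]
```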
Transducers take an input Producer (Iterator), along with other optional parameters, and return another Producer-iterator with one more computation stacked on top of it.
A typical pattern is:
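A sketch of that shape (all names illustrative): a generator function takes the input iterator and returns a new one.

```javascript
function* transducer(input) {
  // initialization code runs before the first value is requested
  for (const value of input) {
    // transform the value and pass it downstream
    yield value
  }
  // finalization code runs after the input is exhausted
}
```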
For example, a map function applying a function to each element is:
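One possible generator-based implementation (a sketch; curried so that map(fn) is itself an iterator-to-iterator function and composes with other transducers):

```javascript
const map = fn => function* (input) {
  for (const value of input) {
    yield fn(value)
  }
}

console.log([...map(x => x + 1)([1, 2, 3])]) // [2, 3, 4]
```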
Or filter, passing further only elements satisfying some predicate:
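Following the same sketch, filter yields only the values the predicate accepts:

```javascript
const filter = pred => function* (input) {
  for (const value of input) {
    if (pred(value)) yield value
  }
}

console.log([...filter(x => x % 2 === 0)([1, 2, 3, 4])]) // [2, 4]
```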
With plain generators we can still call the
next method of an iterator from an asynchronous callback, but we cannot iterate it with
for-of. Recently ECMAScript got async generators and the
for await-of syntax extension for this. Everything in this story works for async generators too, except
for-of is replaced by
for await-of. There is a more detailed case study of async generators as transducers in the “Decouple Business Logic Using Async Generators” article.
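A minimal sketch of the async variant (the names `asyncMap`, `ticks`, and `collect` are illustrative): the same map transducer rewritten as an async generator with for await-of.

```javascript
const asyncMap = fn => async function* (input) {
  for await (const value of input) {
    yield fn(value)
  }
}

// A hypothetical async producer yielding values with a small delay.
async function* ticks() {
  for (const n of [1, 2, 3]) {
    await new Promise(resolve => setTimeout(resolve, 10))
    yield n
  }
}

// An async consumer collecting the transformed stream into an array.
async function collect(input) {
  const result = []
  for await (const value of input) result.push(value)
  return result
}

collect(asyncMap(x => x * 2)(ticks())).then(console.log) // [2, 4, 6]
```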