Dataweave allows programmers to use the *reduce* operation in their scripts.

In fact there are two different kinds of reduce operation, depending on whether the accumulator of the operation is initialised or is left uninitialised.

This operation can also be challenging to use unless one first masters its subtleties.

In this blog post we shall:

- Explain how reduce with an initialised accumulator and an uninitialised accumulator works
- What results to expect from these two operations when they are used in their general form.
- Clarify that reduce with an initialised accumulator is equivalent to the
*foldl*operation in functional programming languages. - Clarify that reduce with an uninitialised accumulator is equivalent to the
*foldl1*operation in functional programming languages.

The last two points are especially important because some confusion exists over the nature of the reduce operation. In particular the following Mulesoft knowledge base incorrectly states that reduce is a *foldr* operation. By showing that reduce is equivalent to foldl/foldl1 we eliminate all ambiguity and provide a model for this operation in terms of well understood operations from the functional programming paradigm.

We start from the reduce operation with an initialised accumulator.

We consider the reduce operation with an initialised accumulator in its general form.

To initialise the accumulator, we need to provide reduce with a lambda expression with two parameters, the first is called the value (val), and the second is called the accumulator (acc) and is initialised to some value m.

The body of the lambda expression simply calls some function f of two parameters during each reduction step executed by reduce. The function f takes acc as its first parameter and val as its second parameter, and returns a result by performing some operation on these parameters. We assume that f takes acc as its first parameter and val as its second parameter to simplify our presentation later on.

`reduce (val,acc=m) -> f(acc,val)`

We now consider how the above reduce operation works.

Suppose that we run the operation on an empty list:

`[] reduce (val,acc=m) -> f(acc,val)`

Then the reduce function returns the value of the accumulator, which is m.

Now suppose that we execute the operation on the finite list of values [x1, x2, x3]:

`[x1,x2,x3] reduce (val,acc=m) -> f(acc,val)`

Stepping through the computation we have the following:

- During its first step, reduce initialises the accumulator acc with the value m, and loads the first element x1 of the list into val. It then uses f to combine the value of acc and val by computing f(m,x1).
- During its second step, reduce loads the value of the accumulated computation f(m,x1) into acc and the next element x2 of the list into val. It then uses f to combine the value of acc and val by computing f(f(m,x1),x2).
- During its third step, reduce loads the value of the accumulated computation f(f(m,x1),x2) into acc and the next element x3 of the list into val. It then uses f to combine the value of acc and val by computing f(f(f(m,x1),x2),x3).
- During its final step, reduce notes that the list has been exhausted, so the last value to be computed is loaded into the accumulator and returned, giving f(f(f(m,x1),x2),x3) as a result.

We now explain how the reduce operation with an uninitialised accumulator is in fact a foldl operation.

As our reference definition of foldl, we shall use the foldl function from the Haskell programming language.

The foldl function takes a function f, a value e and a list x:xs as parameters.

`foldl f e [] = e`

foldl f e x:xs = foldl f (f e x) xs

If foldl is given an empty list, it returns the value e.

If foldl is given a non-empty list, it combines the parameter e with the first element of the list x by computing f e x, and then recursively calls foldl again passing (f e x) as its second parameter and the rest of the list xs as its third parameter.

For instance if we run foldl on the list [x1,x2,x3] with function f and m as the value of the formal parameter e:

`foldl f m [x1,x2,x3]`

We obtain the following sequence of computations:

foldl f m [x1,x2,x3]

foldl f f(m x1) [x2,x3]

foldl f f(f(m x1)`x2) [x3]`

foldl f f(f(f(m x1) x2) x3) []

f(f(f(m x1)x2) x3)

It is easy to see that the formal parameter e of the foldl function acts as an accumulator:

- When foldl is given a non-empty list, the formal parameter e allows the passage of the computation which has been accumulated by the inner foldl calls to the outer foldl call.
- When foldl is applied to the empty list, the value of the formal parameter e is returned, which in this case corresponds to the total value accumulated during the course computation.

Note that running:

`foldl f m [x1,x2,x3]`

Has given us the value f(f(f(m x1) x2) x3), as with the call to reduce.

This means that reduce with an initialised accumulator is equivalent to the function foldl.

We now consider the reduce operation with an uninitialised accumulator in its general form.

`reduce (val,acc) -> f(acc,val)`

Suppose that we execute the above reduce function on an empty list:

`[] reduce (val,acc) -> f(acc,val)`

Then the reduce function will return the value null, which denotes the fact that the reduce operation has not been called correctly, as there was nothing to reduce in the list.

On the other hand, suppose that we execute the above reduce function on a list with a single element [x1]:

`[x1] reduce (val,acc) -> f(acc,val)`

Since the accumulator has not been initialised, the reduce function will load the first element x1 of the list into the accumulator. It then returns the value in the accumulator, because there are no further items in the list.

Finally, suppose that we run reduce on the finite list of elements [x1,x2,x3].

`[x1,x2,x3] reduce (val,acc) -> f(acc,val)`

Stepping through the computation we have the following:

- During its first step, reduce loads the uninitialised accumulator acc with the first element x1 of the list, and loads the next element of the list x2 into val. It will then use f to combine the value of acc and val by computing f(x1,x2).
- During its second step, reduce loads the value of the accumulated computation f(x1,x2) into acc and the next item x3 in the list into val. It then uses f to combine the value of acc and val by computing f(x1,x2),x3).
- During its final step, reduce notes that the list has been exhausted, so the last value to be computed is loaded into the accumulator and returned, giving f(f(x1,x2),x3) as a result.

We now show that the reduce operation with an uninitialised accumulator is in fact a foldl1 operation.

As our reference definition of foldl1, we use the foldl1 function from the Haskell programming language.

The function foldl1 takes a function f and a list x:xs as parameters.

`foldl1 f x:xs = foldl f x xs`

If foldl1 is given an empty list, it returns an error.

If foldl1 is given a list with one element [x1] as a parameter, it evaluates to: foldl f x1 [ ], with the only element of the list being loaded into the accumulator. As we have seen previously, it then evaluates to x1.

If foldl1 is given a non-empty list [x1,x2,x3] as a parameter, it evaluates to foldl f x1 [x2,x3], with the first element of the list being loaded into the accumulator. As we have seen previously, this evaluates to f(…f (f x1 x2) x3)…xn).

Hence we have shown that reduce with an uninitialised accumulator is equivalent to a foldl1 operation.

I hope you have enjoyed this post; please comment if you found it interesting or have any queries or input.

Ricston Ltd.

Triq G.F. Agius De Soldanis,

Birkirkara, BKR 4850,

Malta

Triq G.F. Agius De Soldanis,

Birkirkara, BKR 4850,

Malta

MT: +356 2133 4457

UK: +44 (0)2071935107

info@ricston.com

Need Help?

Ask our Experts!