You can be a boss at reducers if you know this one weird trick!

At least at one point I had hoped that was true. It turns out that getting reducers right requires thinking it through every single time you are confronted with new one. But I think we can come up with enough guidance so that after a few examples, we won’t really need to look at the reducers in collections to come; you’ll be able to understand them and verify them yourself.

## Background

If you look for ‘clojure reduce’ in your search engine of choice, you might run across Reducers. Reducers are a very useful suite of functions that should definitely be in the arsenal of any Clojure programmer, but we have a simpler aim: clojure.core/reduce.

There are two forms of `reduce`, with two and three parameters. The two-parameter form takes a reducing function (of two arguments) that is applied first to the first two items of the collection, then to the result of that invocation and the third item, that result and the fourth item, etc., until the end of the collection is reached and the accumulated result is returned.

``````(reduce '+ [1 2 3 4 5] ; =>15
``````

Effectively it computes `(((1+2)+3)+4+5)`.

The three-paremeter version supplies a starting value that is supplied to the reducing function along with the first item, and so on.

``````(reduce '+ 10 [1 2 3 4 5] ; =>25
``````

Effectivly it computes `((((10+1)+2)+3)+4+5)`.

Here is the contract for `reduce`:

f should be a function of 2 arguments. If val is not supplied, returns the result of applying f to the first 2 items in coll, then applying f to that result and the 3rd item, etc. If coll contains no items, f must accept no arguments as well, and reduce returns the result of calling f with no arguments. If coll has only 1 item, it is returned and f is not called. If val is supplied, returns the result of applying f to val and the first item in coll, then applying f to that result and the 2nd item, etc. If coll contains no items, returns val and f is not called.

The code for `reduce` in `core.clj` is indirect:

``````(defn reduce
([f coll]
(if (instance? clojure.lang.IReduce coll)
(.reduce ^clojure.lang.IReduce coll f)
(clojure.core.protocols/coll-reduce coll f)))
([f val coll]
(if (instance? clojure.lang.IReduceInit coll)
(.reduce ^clojure.lang.IReduceInit coll f val)
(clojure.core.protocols/coll-reduce coll f val))))
``````

A collection implements the `IReduce` and/or `IReduceInit` interfaces to provide a specialized, presumably more efficient, reduction algorithm. Otherwise, the magic of protocols is used to extend reduction to types that do not have these interfaces defined. That code is in `clojure.core.protocols`. The protocols aspect is not our focus here; we are interested in how to implement the interfaces in our collections.

## The interfaces

And here they are:

``````[<AllowNullLiteral>]
type IReduceInit =
abstract reduce: IFn * obj -> obj

[<AllowNullLiteral>]
type IReduce =
inherit IReduceInit
abstract reduce: IFn -> obj
``````

`IReduceInit.reduce` take a function and start argument. `IReduce.reduce` just takes the reducing function. The missing argument is the collection itself, which is `this`.

## Feeling reduced, but not diminished

Before we get to code, we need to have a little chat about `Reduced`. It figures significantly in the code we are about to write.

It is hard to find information about `Reduced`. I checked five books on Clojure and found nary a mention. The most prominent mention of `Reduced` is in the reference on Transducers.

`Reduced` is used to stop reductions early. From the Transducers article:

Clojure has a mechanism for specifying early termination of a reduce:

A process that uses transducers must check for and stop when the step function returns a reduced value (more on that in Creating Transducible Processes). Additionally, a transducer step function that uses a nested reduce must check for and convey reduced values when they are encountered.

A reduced value is literally an object of type `Reduced`. It just wraps a value, making it available through the `deref` method of the `IDeref` interface:

``````type IDeref =
abstract deref: unit -> obj

[<Sealed>]
type Reduced(value) =
interface IDeref with
member _.deref() = value
``````

You can read the Transducers article for reasons for using this. For one thing, it is the only way to run a reduction over an infinite collection – you have to send a signal that you’ve had enough.

One essential rule when writing a `reduce` method (for `IReduce` and `IReduceInit`): after each invocation of the reduction function, check the result to see if it is an instance of `Reduced`; if so, stop immediately and return the `deref` value.

Note: if you are actually writing transducers, you might need to be passing back the `Reduced` object itself. This is not our concern. Our rule is only for `IReduceInit.reduce` and `IReduce.reduce`.

## Some code

I’m going to start with some of the original C# code (essentially identical to the Java code) to see the problems we have with making a translation to F#. One of the simplest comes from `PersistentList`:

``````public object reduce(IFn f)
{
object ret = first();
for (ISeq s = next(); s != null; s = s.next()) {
ret = f.invoke(ret, s.first());
if (RT.isReduced(ret))
return ((IDeref)ret).deref();
}
return ret;
}

public object reduce(IFn f, object start)
{
object ret = f.invoke(start, first());
for (ISeq s = next(); s != null; s = s.next()) {
if (RT.isReduced(ret))
return ((IDeref)ret).deref();
ret = f.invoke(ret, s.first());
}
if (RT.isReduced(ret))
return ((IDeref)ret).deref();
return ret;
}
``````

The contract for `reduce` makes these demands:

1. Without a start value:

a. if the collection has no items, return the result of calling `f` with no arguments

b. if the collection has only one item, return that item (`f` is not called)

c. the first application of f should be to the first and second items in the collection

d. if a call to `f` results in a `Reduced` instance, dereference it and return that value.

2. With a start value:

a. if the collection has no items, return the start value (`f` is not called)

b. the first call to `f` should be on the start value and the first item

c. if a call to `f` results in a `Reduced` instance, dereference it and return that value.

It might appear that requirements (1a) and (2a) are violated in the `PersistentList` code. And then you remember that `PersistentList` always has at least one member so that no check for emptiness is required. You should verify that the other conditions are met.

You can’t take that C# code and just copy it into F#. It relies on early returns out of loops, which we don’t have in F#. And we’d probably prefer to avoid mutating bindings. The technique often used is to translate to a recurive function that does the looping, which is essentially the same in our examples as using a `recur` loop in Clojure.

For example, take our first `reduce` above:

``````        object ret = first();
for (ISeq s = next(); s != null; s = s.next()) {
ret = f.invoke(ret, s.first());
if (RT.isReduced(ret))
return ((IDeref)ret).deref();
}
return ret;
``````

Two things change from iteration to iteration: the values of `ret` and `s`; just look for the assignments to those variables. Those become our parameters. Regular exit is when `s = null` – we negate the condition of loop continuation to get the condition for method termination. Early exit is done by checking for `Reduced`. Thus our loop can be encoded by

``````let rec step (ret:obj) (s:ISeq) =
if isNull s then
ret
else
match f.invoke(ret,s.first()) with
| :? Reduced as red -> (red:>IDeref).deref()
| nextRet -> step nextRet (s.next())
``````

The iteration is started by calling `step` on arguments that set up the correct initial values for `ret` and `s`:

``````step (this:>ISeq).first() ((this:>ISeq).next())
``````

The other `reduce` is similar

``````interface ReduceInit with
member this.reduce(f,init) =
let rec step (ret:obj) (s:ISeq) =
if isNull s then
ret
else
match ret with
| :? Reduced as red -> (red:>IDeref).deref()
| _ -> step (f.invoke(ret,s.first())) (s.next())
let ret = step (f.invoke(start,(this:>ISeq).first())) (this:>ISeq>.next())
match ret with
| :? Reduced as red -> (red:>IDeref).deref()
|_ -> ret
``````

If you look closely, there is distinct difference between the two, both in the original and in the translation. For the first one, in the loop, we call `f` and check its value. For the second one, we check the value from the previous iteration, then call `f` to generate a value to pass for the next iteration. If one writes the start-value version in C# this way:

``````public object reduce(IFn f, object start)
{
object ret = f.invoke(start, first());
if (RT.isReduced(ret))
return ((IDeref)ret).deref();

for (ISeq s = next(); s != null; s = s.next()) {
ret = f.invoke(ret, s.first());
if (RT.isReduced(ret))
return ((IDeref)ret).deref();
}
return ret;
}
``````

the loop body is now the same here as in the first one. Translating ths into F#, the two versions now have identical `step` functions. You can move that into a method, leading to this code:

``````member this.recurser(acc:obj, s:ISeq) =
if isNull s then
ret
else
match f.invoke(ret,s.first()) with
| :? Reduced as red -> (red:>IDeref).deref()
| nextRet -> this.recurser(nextRet, (s.next()))

interface IReduce with
member this.reduce(f) =
let asSeq = this:>ISeq
this.recurser(asSeq.first(),asSeq.next())

interface IReduceInit with
member this.reduce(f,start) =
let asSeq = this:> ISeq
match f.invoke(start,asSeq.first()) with
| :? Reduced as red -> (red:>IDeref).deref()
| acc -> this.recurser(acc,asSeq.next())
``````

Because the start-value version does a call of `f(start,first())` before we get into the loop, we must make sure to check it for `Reduced` before looping.

If you check carefully against our requirements, you will find that they are all met. Do not neglect to do this exercise for every reduce you write. Trust me.

## Cycling

Let’s do one more. There is a `cycle` function in Clojure that “[r]eturns a lazy (infinite!) sequence of repetitions of the items in coll.” It just calls a factory method on a the `Create` class.

``````(cycle [1 2 3] ) ;=> (1 2 3 1 2 3 1 2 3 ...)
``````

A simple implementation of `Cycle` would hold the original sequence on the side so we could start over at the beginning if we have run through all the elements. It then just needs to know the ‘current’ sequence. Calling `first()` on the `Cycle` would just call `first()` on the ‘current’ sequence. Calling `next()` on the `Cycle`, we’d call `next()` on the underlying sequence and make that result the ‘current’ sequence in a new `Cycle` object.

The actual implementation of `Cycle` works a little harder in order to more efficient, by being lazy about calling `next` on the underlying sequence. One does not really need to know the `next()` on the underlying sequence until you either call `first()` or `next()` on the cycle object. At that point you can compute `next()`. We will need a mutable field in our `Cycle` to save the ‘current’ sequence when we finally get around to computing it. This will not be visible from the outside, so `Cycle` is immutable to outward appearance.

It’s probably easier just to look at the code.

``````type Cycle private (meta:IPersistentMap, all:ISeq, prev:ISeq, c:ISeq, n:ISeq) =
inherit ASeq(meta)

[<VolatileField>]
let mutable current : ISeq = c   // lazily realized

[<VolatileField>]
let mutable next : ISeq = n  // cached

private new(all,prev,current) = Cycle(null,all,prev,current,null)

static member create(vals:ISeq) : ISeq =
if isNull vals then
PersistentList.Empty
else
Cycle(vals,null,vals)

member this.Current() =
if isNull current then
let c = prev.next()
current <- if isNull c then all else c

current

interface ISeq with
override this.first() = this.Current().first()
override this.next() =
if isNull next then
next <- Cycle(all,this.Current(),null)

next
``````

A couple of small details. If `Cycle.create(s)` is called with an empty sequence, we return an empty list, not a `Cycle`. If we have `Cycle` object in our hand, we are guaranteed that its base sequence is not empty. Note that both `first()` and `next()` access the ‘current’ sequence through a call to `Current`; that method takes care of noticing if the underlying field `current` is occupied – `null` indicates we haven’t done the work of calling `next` on the underlying sequence yet. When you access `Current`, it will do that computation and save the result. This code also handles cycling back to the beginning if we have reached the end. It’s pretty slick. (Note: the cleverness is in the Java code. I didn’t come up with it. Authorship note in that file credits Alex Miller. Little tricks to promote laziness pop up all over the place.)

On to `reduce`. We will need to advance through the underlying sequence to access successive items. We do not need to use `Cycle.next()` to do this – that would create a bunch of unnecessary `Cycle` items. We just need to compute on the underlying sequence directly, performing the action that is done in `Cycle.Current()`. The following method does this.

``````member this.advance(s: ISeq) =
match s.next () with
| null -> all         // we've hit the end, cycle back to the beginning
| x -> x
``````

Consider the no-start-value version of `reduce`. We always have items, no need to check. The sequence is infinite, so there is no end condition from the sequence. The only way out is to get a `Reduce` back from `f`. I wrote down the sequence of steps and looked for a loop point.

``````    acc <- first
s <- advance from current (because we have just eaten the first element)
Loop:
newAcc = f.invoke(acc, s.first())
check newAcc for Reduced -> leave
``````

How does the start-value version compare?

``````    acc <- start-value
s <- current
Loop:
newAcc = f.invoke(acc, s.first())
check newAcc for Reduced -> leave
``````

The loop is the same, other than how get started. Verify that the conditions (1c), (1d), (2b), and (2c) are met. (The others don’t matter.) And this goes straight to code.

``````member this.reducer(f: IFn, startVal: obj, startSeq: ISeq) =
let rec step (acc: obj) (s: ISeq) =
match f.invoke (acc, s.first ()) with
| :? Reduced as red -> (red :> IDeref).deref ()
| nextAcc -> step nextAcc (this.advance s)

step startVal startSeq

interface IReduce with
member this.reduce(f) =
let s = this.Current()
this.reducer (f, s.first (), this.advance (s))

interface IReduceInit with
member this.reduce(f, v) = this.reducer (f, v, this.Current())
``````

## Side note

The only way to test `Cycle.reduce` is to have an `IFn` that at some point returns a `Reduced` object. The magic of F#’s object expressions comes in handy here. We can create an object directly that implements `IFn`. However, don’t try to do this with an object expression based on `IFn` directly – you’d have to have an entry for each of the almost-20 `invoke` methods. Instead, you can base your object expression on `AFn`, an abstract class that has default implementations (raising `NotImplementedException’) for all of them. Here is an extract of my test code (using Expecto for writing the tests):

``````let adderStopsShort n =
{ new AFn() with
member this.ToString() = ""
interface IFn with
member this.invoke(x, y) =
if Numbers.gte(y ,n:>obj) then Reduced(x) else Numbers.add(x,y) }

let iter = Cycle.create(LongRange.create(100)) :?> IReduce
Expect.equal (iter.reduce ((adderStopsShort 10),100L)) 145L "Add them up for a while, with a kicker"
``````

Our cycle is based on a 100-element `LongRange`. `addStopsShort` called on a stop value yields an `IFn` with this behavior: When the second argument reaches the stop value, it returns the current value of the accumulator wrapped in a `Reduced`; otherwise, it is just `+`.

(The override of `ToString` in the object expression is necesary. It seems you can’t just override an inteface only.)

And with that, let’s quit.

## Behind the scenes

What I’ve not talked about is all the machinery behind the `CollReduce` protocol. That all lies out in the Clojure source code and is not our present concern. Mostly. I did have to dig into it to solve one problem. There is a `reduce` method in `ArrayChunk`. That actually is the reduce method for the `IChunk` interface. (See previous post Laziness and Chunking The `reduce` method in `ArrayChunk` does stop early when it gets a `Reduced` object back from the reducer function, but it returns the `Reduced` object, not the wrapped value. I struggled with this for a while until finally getting set on the correct track by Alex Miller over in the #clr-dev channel in the Clojurian slack. First is to note that this `reduce` is for `IChunk`. Then you have to figure out where it gets called from. And that’s where the protocol comes in. `reduce` will through the `CollReduce` protocol, which in this case will end up going through the `InternalReduce` protocol, wherein we find a handler for `IChunkedSeq`:

``````  clojure.lang.IChunkedSeq
(internal-reduce
[s f val]
(if-let [s (seq s)]
(if (chunked-seq? s)
(let [ret (.reduce (chunk-first s) f val)]
(if (reduced? ret)
@ret
(recur (chunk-next s)
f
ret)))
(interface-or-naive-reduce s f val))
val))
``````

It is this handler that calls `Chunk.reduce`. It notes the returned `Reduced` value, stops the iteration, and does the `deref`. If `ArrayChunk` did the `deref`, this handler woudn’t know to stop.

If you want to get a sense of the history of reduce, reducers, and transducers, check out the Clojure change log. These things take time to develop. Changes sometimes work through the code slowly. `clojure.lang.Reduced` was introduced in 2012 and incorporated into some of the `reduce` methods at that time. (Here is the commit.) But other edits came later. For example, it was two years later that `IReduceInit` was split off from `IReduce` (this commit) and checking for a `Reduce`‘d value was added to `PersistentList.reduce()` (this commit).