I’ll sketch the code for the root of the PersistentHashMap structure in this post. The focus is on the data structures used and the primary operations: adding a key/value pair, find the value associated with a key, and removing a key. We’ll start with the Clojure interfaces that must be implemented.

Refresher

You can refer to the first post in this series for the background on HAMTs.

Interfaces

To fit into the Clojure eco-system, we will need to implement the following interfaces. We discussed these in detail in This map is the territory.

[<AllowNullLiteral>]
type Seqable =
    abstract seq: unit -> ISeq

and [<AllowNullLiteral>] IPersistentCollection =
    inherit Seqable
    abstract count: unit -> int
    abstract cons: obj -> IPersistentCollection
    abstract empty: unit -> IPersistentCollection
    abstract equiv: obj -> bool

and [<AllowNullLiteral>] ISeq =
    inherit IPersistentCollection
    abstract first: unit -> obj
    abstract next: unit -> ISeq
    abstract more: unit -> ISeq
    abstract cons: obj -> ISeq

[<AllowNullLiteral>]
type ILookup =
    abstract valAt: key: obj -> obj
    abstract valAt: key: obj * notFound: obj -> obj

[<AllowNullLiteral>]
type IMapEntry =
    abstract key: unit -> obj
    abstract value: unit -> obj

[<AllowNullLiteral>]
type Associative =
    inherit IPersistentCollection
    inherit ILookup
    abstract containsKey: key: obj -> bool
    abstract entryAt: key: obj -> IMapEntry
    abstract assoc: key: obj * value: obj -> Associative

[<AllowNullLiteral>]
type Sequential =
    interface
    end

[<AllowNullLiteral>]
type Counted =
    abstract count: unit -> int

[<AllowNullLiteral>]
type IPersistentMap =
    inherit Associative
    inherit IEnumerable<IMapEntry> // do we really need this?
    inherit Counted
    abstract assoc: key: obj * value: obj -> IPersistentMap
    abstract assocEx: key: obj * value: obj -> IPersistentMap
    abstract without: key: obj -> IPersistentMap
    abstract cons: o: obj -> IPersistentMap

At the root

The code below is fairly direct translation of the original C# code, a direct translation of the original Java code from Clojure(JVM). There are ways in which the code might not be considered idiomatic F#. For example, the code uses null values as flags instead of using option types. Deal with it.

Actually, after getting this code running, I re-implemented the code at least three times to explore other choices. It was … interesting. Mostly, small variations in efficiency and readability. I thought I’d just stick with the code that’s been heavily tested in the wild.

There is one immediate consequence of using null as a flag: the internal nodes that contain keys cannot handle null itself as a key. So the root data node holds a flag and the value for the null key, if required.

 PersistentHashMap private (meta: IPersistentMap, 
                            count: int, 
                            root: INode, 
                            hasNull: bool, 
                            nullValue: obj) =
    inherit APersistentMap()

The PersistentHashMap class represents the map itelf. It holds any associated metadata (needed for the IObj and IMeta interface implementations). It holds the count of entries so we can implement Counted (for an efficient count method). The hasNull flag indicates if null is present as a key in the map; the nullValue field holds the associated value.

The root field contains a pointer to the index node at the top of the tree. There are three types of nodes in the tree: BitmapIndexedNode, ArrayNode, and HashCollisionNode; each implements the INode interface. In general, operations on the map are performed on the root and generally defer to the root node to do the heavy lifting – those are the operations defined in INode.

The base class, APersistentHashMap provides some standardized implementations for things that are common to most maps and that we will not concern ourselves with here. For example, implementations of map equality, of IDictionary, etc.

We provide some convenient accessors for these fields:

    member internal _.Meta = meta
    member internal _.Count = count
    member internal _.Root = root
    member internal _.HasNull = hasNull
    member internal _.NullValue = nullValue

Note that root can be null; this will be the case if the map is empty or has only the null key as an entry. Thus we can define our default ‘empty’ map via:

    static member Empty = PersistentHashMap(null, 0, null, false, null)

At this point, many interface operations are simple. Either they can be done directly or we delegate them to the root.

    interface IMeta with
        override _.meta() = meta

    interface IObj with
        override this.withMeta(m) =
            if LanguagePrimitives.PhysicalEquality m meta then
                this
            else
                PersistentHashMap(m, count, root, hasNull, nullValue)
    
    interface Counted with
        override _.count() = count

    interface IPersistentCollection with
        override _.count() = count
        override _.empty() =
            (PersistentHashMap.Empty :> IObj).withMeta (meta) :?> IPersistentCollection     
            // empty passes the metadata along   

We start getting a bit more in the weeds as we get into detailed operations:

    interface ILookup with
        override this.valAt(k) = (this :> ILookup).valAt (k, null)

        override _.valAt(k, nf) =
            if isNull k then
                if hasNull then nullValue else nf
            elif isNull root then
                null
            else
                root.find (0, NodeOps.hash (k), k, nf)

You see in the second valAt our special handling of a possible null key and the null key value.
If the root is null, there are no non-null entries are so nothing to find. else we defer the lookup to the root node. The find method takes the level in the tree (0 at the start), the hash value for the key, the key, and the ‘not found’ value to return if the key is not in the map.

The Associative inteface is the more map-specific version of ILookup. It deals with key presence and returning IMapEntry objects (key/value pairs) instead of just the values. But the form is similar to the what we just saw.

    interface Associative with
        override _.containsKey(k) =
            if isNull k then
                hasNull
            else
                (not (isNull root))
                && root.find (0, NodeOps.hash (k), k, PersistentHashMap.notFoundValue)
                   <> PersistentHashMap.notFoundValue

        override _.entryAt(k) =
            if isNull k then
                if hasNull then
                    upcast MapEntry.create (null, nullValue)
                else
                    null
            elif isNull root then
                null
            else
                root.find (0, NodeOps.hash (k), k)

You will note a fairly standard trick: we use a special value, notFoundValue, to indicate that a key was not found. Thisis passed down to the nodes below. If it comes back, no entry for the key was found.

Other than the lookup functionality defined above, the map operations of interest as assoc, adding a new key/value pair, and without, removing a key. There are in the IPersistentMap interface. Starting with assoc:

    interface IPersistentMap with
        override this.assoc(k, v) =
            if isNull k then
                if hasNull && LanguagePrimitives.PhysicalEquality v nullValue then
                    upcast this
                else
                    upcast PersistentHashMap(meta, (if hasNull then count else count + 1), root, true, v)

We being with special handling for a null key. If present, and the value matches, there is no change: just return this. Otherwise, we create new map with the null key/value pair. The count will increase or not depending on whether the null key was already present. Continuing:

            else
                let addedLeaf = BoolBox()

                let rootToUse: INode = if isNull root then upcast BitmapIndexedNode.Empty else root

                let newRoot = rootToUse.assoc (0, NodeOps.hash (k), k, v, addedLeaf)

                if LanguagePrimitives.PhysicalEquality newRoot root then
                    upcast this
                else
                    upcast
                        PersistentHashMap(
                            meta,
                            (if addedLeaf.isSet then count + 1 else count),
                            newRoot,
                            hasNull,
                            nullValue
                        )

This is already a bit in the weeds. We have to have a root to defer the operation to. If there is a value for root, we use. Otherwise we create an empty node of the appropriate type (BitmapIndexedNode – much more on this coming below). We defer the assoc operation to the root node. If the root node comes back unchanged, then the key/value pair was already present and the assoc operation was a no-op – we can return this. If the root node is different, then we have a new root node and we create a new map with that root node. The count of entries in the map will increase if the key was not already present. The special object addedLeaf is a sentinel that will be set if a new entry is added to the map. The BoolBox type is a simple mutable boolean type.


type BoolBox(init) =

    let mutable value: bool = init

    new() = BoolBox(false)

    member _.set() = value <- true
    member _.reset() = value <- false
    member _.isSet = value
    member _.isNotSet = not value

(There are other ways to handle this. I just used what the C#/Java versions used.)

The assocEx operation is an assoc that throws an exception if the key is present.

        override this.assocEx(k, v) =
            if (this :> Associative).containsKey (k) then
                raise <| InvalidOperationException("Key already present")

            (this :> IPersistentMap).assoc (k, v)

Finally, the without operation:

        override this.without(k) =
            if isNull k then
                if hasNull then
                    upcast PersistentHashMap(meta, count - 1, root, false, null)
                else
                    upcast this
            elif isNull root then
                upcast this
            else
                let newRoot = root.without (0, NodeOps.hash (k), k)

                if LanguagePrimitives.PhysicalEquality newRoot root then
                    upcast this
                else
                    upcast PersistentHashMap(meta, count - 1, newRoot, hasNull, nullValue)

There is the usual special case handling for the null key. When we do the operation on the root, getting back the same root indicates the key was not present, so removing it was a no-op. Otherwise, we have a new root and reduced count.

In the next post, we look at the structure of the INode interface and the three node types that implement it.