PersistentHashMap, part 4 -- Other matters
We continue our discussion of PersistentHashMap
with a discussion of transiency and alternative F# coding techniques.
The previous posts in this series are:
Transiency
I refer you to the discussion of transiency in the PersistentVector
series.
As for PersistentVector
, PersistentHashMap
supports interfaces such as IEditableCollection
, ITransientCollection
, ITransientAssociative
. There is one additional, map-specific transiency inteface:
[<AllowNullLiteral>]
type ITransientMap =
inherit ITransientAssociative
inherit Counted
abstract assoc: key: obj * value: obj -> ITransientMap
abstract without: key: obj -> ITransientMap
abstract persistent: unit -> IPersistentMap
. The same mechanisms are used to implement transiency in PersistentHashMap
as in PersistentVector
. I leave the implementation to you as an exercise.
Alternative coding techniques
The code described in these posts is a direct translation of the C# code, itself directly translated from the Java code in Clojure(JVM).
I decided to experiment with idiomatic F# techiques, with an eye toward code clarity and seeing the impact on performance.
Options
We have special handling of of null
as a key because null
is used as a signal in BitmapIndexedNode
to discriminate between an key-value entry and a node entry. Also, empty slots in an ArrayNode
are signaled by null
. We could use Option
types to handle these cases. This means also that INode
and its implementating classes no longer need to be AllowNullLiteral
. We do manage to get rid of the special handling of null
at the top, in PersistentHashMap
itself. Some if
expressions can be replace by match
, etc. But there is not much gain.
Discriminated unions
We could use discriminated unions to represent the different types of nodes. I did this before I worked on transiency. It was … okay. Not a big help. The big problem comes with transiency: as implemented it requires some mutable fields (counts, bitmaps) and discriminated unions cannot have mutable fields. I went so far as to put ref
fields in the DUs – you can do that – but the resulting code was really not pretty.
One place I did try discriminated unions was in the BitmapIndexedNode
. The existing implementation uses an array with even entries being null
or a (non-null
) key and the odd entries being correspondingly the node the next level down or the key’s value. I replaced this with a DU with two cases, one for a key-value pair and one for a node. The array is now half the size, with entries of type
BNodeEntry =
| BKey of Key:obj
| BSubNode of Node:INode
In one version I had a separate array to contain the key values. I also tried
| BKey of Key:obj * Value:obj
| BSubNode of Node:INode
This works fine until you try to implement transiency. Here the problem is that transiency ends up creating a few more entries in the array than currently required, to some of the entries are empty. You either end up with a BNodeEntry option array
– the pattern matching gets just nasty with this one – or you end up with
BNodeEntry =
| KeyValue of Key: obj * Value: obj
| Node of Node: INode2
| EmptyEntry
I ran out of ideas. Sometimes the code was a little easier to look at, but generally the overall complexity of the algorithms in things like BitmapIndexedNode.assoc
overwhelmed any gains. Nothing seemed to improve performance.
Given that each variation was causing me to rework hundreds of lines of code, I also ran out of energy. I decided to stick with the original implementation.
Onward
We conclude this series of posts with a look at performance and a comparison of the F# and Clojure versions of PersistentHashMap
.