Sunday, July 1, 2012

Exploring C# LinkedLists via LRU Caches (3 of 3)

Downloads for post:
Long after I had deposited the LRU Cache (described in the previous post) into the library, I was poking around the guts of Dictionary<TKey, TValue> to get a better understanding of how it stored data internally. The implementation is pretty compact, and avoids creating a lot of extra objects for the contents of its internal collections.  When key/value pair is added, the HashCode of the key is mapped to a bucket, where the total number of buckets is the dictionary’s capacity. Each bucket is the head of a single linked list of all the entries added to the dictionary that mapped to the same bucked. All these single linked lists are contained in one big array of entry structs. Deletions are handled by maintaining a single linked list of free entry slots. The old LRU Cache I’d worked on before popped into my head, and it occurred to me that there might be some benefit in having it lean on a linked list that did not use objects for each node, but instead was implemented as an array of node structs.

You can’t just take a regular class implementation of a linked list, and simply replace “class” with “struct” in your node class. The reason is that classes use reference semantics, but structs use value semantics. What that means in practice is that a struct type variable is its own copy of the struct contents. If you say
SomeStruct a, b;
a = new SomeStruct();
a.SomeMember = 100;
b = a;
b.SomeMember = 50;

then a.SomeMember will be 100 and b.SomeMember will be 50. If you did the same thing with classes
SomeClass a, b;
a = new SomeClass();
a.SomeMember = 100;
b = a;
b.SomeMember = 50;

then a.SomeMember and b.SomeMember will both be 50. A struct variable holds a whole copy of a struct, not just a reference to it. If we simply change a node class to being a struct
struct LinkedNode {
    public string Value;
    public LinkedNode Previous;
    public LinkedNode Next;
}

then we make the compiler very grumpy. A LinkedNode variable would be a chunk of memory containig a string reference followed by a LinkedNode chunk of memory containing a string reference followed by a LinkedNode chunk of memory containing…ad infinitim. It just can’t work, and the compiler will let you know that. But there is another way.

What if we say that a LinkedNode struct always lives as an element in an array of LinkedNode structs, and that you refer to a node is by its array index? Then we can define a LinkedNode struct like this:
struct LinkedArrayNode {
    public string Value;
    public int PreviousIdx;
    public int NextIdx;
}

That will work - there's no problem making an array of those. Using this approach, I wrote a linked list class called LinkedArray that keeps its linked list as an array of structs, thus requiring no extra object creation for nodes. Then I wrote a third version of an LRU Cache called LruCacheSlim that uses a LinkedArray to store the cache contents.

LinkedArray


There are a few pieces that go into making LinkedArray work. There is the definition of the node struct which is just a value, the index of the previous node, and the index of the next node. The main piece is an array of these node structs. The size of the array is the LinkedArray’s capacity. Initially the capacity is all unused, so it will need to keep a count of how far into the array it has put active nodes. It needs to track the index of what is considered the head node (first in the linked list, although not necessarily the first position in the array). Once an active node has been placed in the array, I never want it to physically move within the array as a result of insertions or deletions. Block moving array elements will slow things down, and it’s important that nodes have completely stable locations since their index number is how they are referenced. So deletions will need to be handled by maintaining a free list of nodes that were active, but have been deleted. This list is a single linked list that acts like a stack. Finally, it needs to keep a count of items in the free list. When a new node is being inserted into the list, if the free list count is greater than 0 then the next free slot will be used and the free count decremented. Otherwise, the next virgin position in the unused capacity is used. The position of the next virgin position is the count of active nodes plus the free list count. The basic declaration of LinkedArray members looks like this:
public class LinkedArray<T> : ICollection<T>, IEnumerable<T>, IEnumerable {
    protected const int NullIdx = -1; // index value used to indicate null node (since nodes are structs, there is no true null node)
    protected const int FreeIdxMarker = -2;  // index value used in node's PreviousIdx to mark a node as a member of the free list

    private LinkedArrayNode[] nodes;// array of node structures
    private int head = NullIdx; // index of the head of the double linked list of active nodes (circular, so tail is previous to head, for easier maintenance)
    private bool circular;      // indicates whether linked list is presented to consumer as circular or not
    private int count;          // number of active nodes
    private int freeCount;      // number of nodes in the free list 
    private int freeIdx = FreeIdxMarker; // index of the head of single linked list of free nodes
    private int version;        // indicates modifications

    ... contstructors, properties, and methods ...
    
    private struct LinkedArrayNode {
        public T Value;
        public int PreviousIdx;
        public int NextIdx;

        internal LinkedArrayNode(T value) {
            this.Value = value;
            this.NextIdx = NullIdx;
            this.PreviousIdx = NullIdx;
        }
    }
}

The following diagram shows snapshots of how the LinkedArray looks immediately after the method to the right of that snapshot. These steps demonstrate how the LinkedArray works under the covers.



Granted, there is a some risk in using integer indexes to reference nodes rather than referencing objects. But the benefits may make it worth it. In particular, LinkedArray can be a useful component for building other collections, which is what I did with the LRU Cache.

LruCacheSlim


The implementation of LruCacheSlim is nearly the same as those described in the previous two posts, because from its viewpoint, LinkedArray is just a linked list, except instead of referencing node objects, it refers to node indexes. Rather than make this long post longer by pasting in the code, you can download it using the link at the top of this post.

In comparing LruCacheSlim with the two cache implementations in the previous two posts, how does it rate? For speed, it's somewhere in the middle between the previous two except for the operation of inserting items into a cache that hasn’t reached its capacity yet. In that case it displays a much more linear time profile, outperforming the other two versions handily where the cache capacity is very large (millions of items). Here are graphs comparing the three for the most common operations except for trying to get an item that doesn’t exist in the cache (where they are all nearly identical since all three just execute a lock and a Dictionary.TryGetValue).


This makes a few things apparent. First, they’re all pretty fast, and in the same ballpark. This is probably because most of the time they spend is from the lock{ } and Dictionary operations they all share in common. The first two versions that use objects for nodes have flatter lines for Adds where the new item replaces the oldest item (in a full cache), vs. Adds into a cache that hasn’t reached capacity. I think this is because once capacity has been reached, they stop creating new node objects, and after that just reuse the oldest node object that is getting dumped from the cache. For the very large cache sizes, the impact of creating all those node objects appears to increase the time cost of creating subsequent objects. But I’m speculating on that point.

Since they’re all pretty similar in speed, why bother using LruCacheSlim that uses LinkedArray? Besides speed, we should also consider memory. There is a cost associated with creating a bunch of node objects, but LinkedArray doesn’t suffer that cost. Because of that, LruCacheSlim consumes about 2/3 the overhead space of the other two for the same capacity. Another way of looking at that is if you feel like you can safely cache some number of Foo objects with LruCacheOrig or LruCache (from a memory consumption perspective), you could cache more Foos with LruCacheSlim. Being able to cache more items might substantially improve overall speed of the application using the cache, in a way not captured by how many nanoseconds faster one version’s operations are over another’s.