Thursday, April 29, 2010

How to write a simple yet “bullet-proof” object cache

…continued from a previous post, by Eduardo Rodrigues

As promised, in this post I’ll explain how we solved the second part of the heap memory exhaustion problem described in my previous post: the skin resources cache. This part was much trickier to fix because we couldn’t simply get rid of the custom cache, as we did with the resource bundles. We needed to keep the skins cache for two reasons: the cached objects were not instances of java.util.ResourceBundle but strings containing CSS, and they had to be indexed by a specific combination of variables used as cache keys, including skin base name, client locale and user agent. As already mentioned, what causes the heap exhaustion is the large number of possible combinations of the components used to compose the cache keys. So the challenge here was to find a safe and efficient way of dealing with any possible number of entries in our skins cache, while also ensuring its size would never cause an OutOfMemoryError that would ultimately crash the JVM.

(I know this will be a rather complicated post but I’ll do my best to keep it as simple as possible.)

The very first and most obvious approach that came to mind was to implement our own “HashMap” using weak or soft references, available since JDK 1.2. In a tiny nutshell, objects of these classes wrap the access to a target object and receive special treatment from the garbage collector (GC). If the only “live” references to an object – meaning valid references currently in use, not orphans – come from instances of either java.lang.ref.WeakReference or java.lang.ref.SoftReference, then that object becomes eligible for discard, even if the Weak/SoftReference objects themselves are still “alive”. The main difference between a weak and a soft reference is that the former allows the GC to discard the referenced object as soon as possible, while the latter makes the GC keep it alive as long as available heap space is not yet critical.
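Just to illustrate the difference, here is a minimal, self-contained example (mine, not from the original application); keep in mind that System.gc() is merely a hint and, as discussed below, none of this timing is guaranteed:

    import java.lang.ref.SoftReference;
    import java.lang.ref.WeakReference;

    public class ReferenceDemo {
        public static void main(String[] args) {
            Object a = new Object();
            Object b = new Object();
            // Each reference wraps its own target; mixing both kinds on the
            // same object would keep it softly reachable and skew the demo.
            WeakReference<Object> weak = new WeakReference<Object>(a);
            SoftReference<Object> soft = new SoftReference<Object>(b);

            a = null; // drop the last "hard" reference to each target
            b = null;

            System.gc(); // just a hint; the GC offers no timing guarantees

            // After a collection, weak.get() will likely be null, while
            // soft.get() will usually still return the object as long as
            // heap space is plentiful.
            System.out.println("weak: " + weak.get() + ", soft: " + soft.get());
        }
    }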

The reason we didn’t initially think of using java.util.WeakHashMap is that it uses weak references for the map’s keys, whereas we needed them on the map’s values. Another detail is that we wanted to have two implementations on hand, one using weak and the other using soft references. That’s because we had some concerns about how each solution would affect GC performance in a heavily loaded, concurrent production environment. Actually, our concern was more with the soft references. We feared that the GC’s leniency with soft references would cause such objects to accumulate quickly in the heap until the critical point where the GC would have to discard them. Since the number of soft references to be discarded tends to increase fast, this could have a substantial impact on the overall cost of a GC cycle.

So, we started implementing our first idea, which basically consisted of writing two classes, WeakCache and SoftCache, both implementing the java.util.Map interface and using an internal java.util.HashMap – mapping each key to a java.lang.ref.WeakReference or java.lang.ref.SoftReference wrapping the value – as backing storage. With that, we also needed to associate each new value reference with a java.lang.ref.ReferenceQueue, since this is apparently the only way of keeping track of the referenced objects discarded by the GC, and thus of removing the corresponding map entries as well. With that, we hoped to ultimately achieve the same functionality offered by java.util.WeakHashMap, while adding a soft reference flavor to it and moving the auto-discardable references from the map’s keys to its values.
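A rough sketch of that first idea could look like this (illustrative names and shapes only, not our actual code): values are wrapped in weak references registered with a queue, and stale entries are expunged by polling it:

    import java.lang.ref.Reference;
    import java.lang.ref.ReferenceQueue;
    import java.lang.ref.WeakReference;
    import java.util.HashMap;
    import java.util.Map;

    public class NaiveWeakCache<K, V> {

        // Keeps the key inside the reference so the map entry can be found
        // again once the GC enqueues the cleared reference.
        private static final class Entry<K, V> extends WeakReference<V> {
            final K key;
            Entry(K key, V value, ReferenceQueue<V> queue) {
                super(value, queue);
                this.key = key;
            }
        }

        private final Map<K, Entry<K, V>> map = new HashMap<K, Entry<K, V>>();
        private final ReferenceQueue<V> queue = new ReferenceQueue<V>();

        public void put(K key, V value) {
            expungeStaleEntries();
            map.put(key, new Entry<K, V>(key, value, queue));
        }

        public V get(K key) {
            expungeStaleEntries();
            Entry<K, V> entry = map.get(key);
            return (entry == null) ? null : entry.get();
        }

        // Drains the queue, dropping entries whose values are already gone.
        // (Real code should also check the mapping still points to this
        // exact reference before removing it.)
        @SuppressWarnings("unchecked")
        private void expungeStaleEntries() {
            Reference<? extends V> ref;
            while ((ref = queue.poll()) != null) {
                map.remove(((Entry<K, V>) ref).key);
            }
        }
    }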

Looks good. But not good enough.

There are two big issues with the approach described above:
  1. A WeakReference can be too volatile.

    In theory, even to the point where it could render our WeakCache almost useless. The main reason is that, in our particular case, we were dealing with a web application whose view layer is practically 100% an AJAX client, and the skin resources being cached are only used in the mid-tier (where the cache lives) during the user login flow, which is processed in a single request. Therefore, we can pretty much assume that once a login request is over – unless another request working with the same just-cached skin resource is currently being processed on the same JVM – there won’t be any “hard” references left to that object at all.

    If you are somewhat familiar with Java, you’ll probably agree with me that one of the most certain things about the garbage collector is actually its uncertainty. By definition, we can never guarantee when the GC will kick in, much less how it will behave or which objects will actually be discarded during its cycle; therefore, no code should ever rely on any particular assumption about the GC’s behavior. On top of that, if we look closely at the Javadoc for java.lang.ref.WeakReference, we’ll notice that there are no guarantees whatsoever about the exact moment when disposable weakly-referenced objects will in fact be discarded by the GC. In our case, in a worst-case scenario, this unpredictability pretty much means that a new skin resource object just added to our WeakCache during a login request could be discarded by the GC soon after the request was processed by the mid-tier. In other words, depending on the rate at which new login requests reach the container (or JVM) where the cache lives, and on the skin resources they work with, our WeakCache could tend to be empty most of the time. And that wouldn’t be a very useful cache :)

    The way we solved this “minor setback” was by adding a configurable structure to our WeakCache, which we called a “hard cache”. The idea is very simple: we just added an internal linked list and an optional argument called “hardCacheSize”. Specifying zero as the hardCacheSize (or not specifying it at all) turns the internal hard cache off. If the hard cache is needed (and it was in our case), a hardCacheSize greater than zero should be specified. In that case, every time a skin resource is added to the cache as a weakly-referenced value in the internal HashMap, or is found and fetched directly from the cache, a normal (“hard”) reference to that very same instance is also appended to the internal LinkedList. Once the list reaches the defined hardCacheSize, its first (oldest) node is simply removed to make room for the one being appended (newest), so the list never grows beyond the hardCacheSize defined. The intention here is to add a simple LRU (Least Recently Used) policy to our cache: the more an object is “hit” in the cache, the more hard references to it will exist in the LinkedList, and, because it’s a linked list, the order in which each hard reference was added is maintained – its first node always points to the oldest reference and its last node to the newest. Having at least one hard reference to a cached object coming from the internal hard cache should be enough to prevent the high volatility issue described here.

    If you’re wondering why I haven’t mentioned the SoftCache here, it’s not because I forgot. Due to the nature of soft references, they are – at least in theory – much less vulnerable to the issue just described.

  2. Efficient monitoring of associated reference queues can be tricky and costly.

    As I mentioned before, if you read the Javadocs for package java.lang.ref, you’ll see that the recommended way to keep track of which weakly or softly-referenced objects have been disposed of by the GC is to associate a java.lang.ref.ReferenceQueue instance with the references. There are two distinct ways to work with this queue:

    2.1. keep polling the queue in a controlled loop to check whether a new discarded reference is available;
    2.2. “listen” to the queue by using the blocking method remove([long timeout]).

    Either way, if we want to keep our cache’s availability high, the smartest way of implementing a reference queue handler would probably involve background threads or – since we’re talking about a J2EE application on Java 5.0 – one of the executor services available in package java.util.concurrent.

    Needless to say, this certainly represents an overhead, both in terms of code to write and of complexity (a sketch of the blocking variant appears right after this list).
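For illustration, option 2.2 could be implemented along these lines (a minimal sketch; QueueWatcher and evict() are names I made up for the example):

    import java.lang.ref.Reference;
    import java.lang.ref.ReferenceQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class QueueWatcher {

        private final ReferenceQueue<Object> queue = new ReferenceQueue<Object>();
        private final ExecutorService executor = Executors.newSingleThreadExecutor();

        public void start() {
            executor.submit(new Runnable() {
                public void run() {
                    try {
                        while (!Thread.currentThread().isInterrupted()) {
                            // Blocks until the GC enqueues a cleared reference
                            Reference<?> ref = queue.remove();
                            evict(ref);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt(); // shutting down
                    }
                }
            });
        }

        private void evict(Reference<?> ref) {
            // Here the cache would remove the map entry corresponding to the
            // cleared reference (omitted in this sketch).
        }
    }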
So, while we were thinking hard to come up with the most efficient, generic and elegant way of finally implementing our weak and soft caches, Mr. Eric Chan, one of the main architects on the Oracle Beehive team, had a very interesting breakthrough. In short, he thought of a very nice way of combining WeakReference and SoftReference in our weak and soft caches so that they would provide exactly the same functionality without having to deal with those reference queues at all. Basically, instead of using a plain HashMap as our backing storage, we used a java.util.WeakHashMap in both cache implementations. The real trick was what to store in it, and how.

Let’s start with the easiest one…

The WeakCache

This one was simple. We just replaced the backing HashMap<Key,Value> with a WeakHashMap<Key,Value> and used the internal “hard cache” I mentioned above as an implicit way of controlling which entries in the internal map are still in use and which ones can be disposed of. The trick here was to add hard references to the map entries’ keys to the “hard cache”, instead of to the values. So, if there aren’t any hard references to a map key, either outside the WeakCache or in the cache’s internal linked list, this naturally implies that the corresponding entry in the WeakHashMap is available for disposal. To make it easier to understand, look at the diagram below, summarizing how the WeakCache works:

  WeakCache Diagram
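In code, the core of the idea could look like this (a minimal sketch, not the downloadable sample itself; the synchronization is deliberately simplistic):

    import java.util.Collections;
    import java.util.LinkedList;
    import java.util.Map;
    import java.util.WeakHashMap;

    public class WeakCache<K, V> {

        // The actual cache content: entries vanish once their keys become
        // weakly reachable.
        private final Map<K, V> map =
                Collections.synchronizedMap(new WeakHashMap<K, V>());

        // The "hard cache": a bounded LRU list of hard references to the keys.
        private final LinkedList<K> hardCache = new LinkedList<K>();
        private final int hardCacheSize;

        public WeakCache(int hardCacheSize) {
            this.hardCacheSize = hardCacheSize;
        }

        public V put(K key, V value) {
            touch(key);
            return map.put(key, value);
        }

        public V get(K key) {
            V value = map.get(key);
            if (value != null) {
                touch(key); // a cache hit appends another hard reference
            }
            return value;
        }

        // Appends a hard reference to the key, evicting the oldest one when
        // the list is full; with size zero, the hard cache is simply off.
        private synchronized void touch(K key) {
            if (hardCacheSize <= 0) {
                return;
            }
            if (hardCache.size() >= hardCacheSize) {
                hardCache.removeFirst(); // least recently used goes first
            }
            hardCache.addLast(key);
        }
    }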

Now let’s go on to the trickier one…

The SoftCache

This one was the trickiest. We also replaced the internal HashMap with a WeakHashMap, but here the mapped values are SoftReference objects. Instead of pointing directly to the target object being added to the cache, each of them points to an instance of a wrapper class called ValueHolder. The trick here was to make the ValueHolder hold not only the map entry’s value object but also its corresponding key object. And what goes into the internal “hard cache” in this case are hard references to these ValueHolder instances. The internal “hard cache” is therefore the only place holding hard references to the softly-referenced ValueHolder objects mapped in the internal WeakHashMap, and those ValueHolder objects hold hard references to their corresponding map keys. Ultimately, provided no external hard references to a key K exist, the map entry corresponding to K will be kept in the map as long as at least one ValueHolder instance containing K exists in the internal “hard cache”. Otherwise, the weakly-referenced key K becomes available for disposal, and so does its corresponding map entry.

SoftCache Diagram
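Again in code, the idea could be sketched like this (illustrative only, not the downloadable sample; ValueHolder is the wrapper described above):

    import java.lang.ref.SoftReference;
    import java.util.Collections;
    import java.util.LinkedList;
    import java.util.Map;
    import java.util.WeakHashMap;

    public class SoftCache<K, V> {

        // Holds a value together with a hard reference back to its key, so a
        // live ValueHolder keeps the corresponding WeakHashMap entry alive.
        private static final class ValueHolder<K, V> {
            final K key;
            final V value;
            ValueHolder(K key, V value) {
                this.key = key;
                this.value = value;
            }
        }

        private final Map<K, SoftReference<ValueHolder<K, V>>> map =
                Collections.synchronizedMap(
                        new WeakHashMap<K, SoftReference<ValueHolder<K, V>>>());

        // The "hard cache": the only hard references to the ValueHolders.
        private final LinkedList<ValueHolder<K, V>> hardCache =
                new LinkedList<ValueHolder<K, V>>();
        private final int hardCacheSize;

        public SoftCache(int hardCacheSize) {
            this.hardCacheSize = hardCacheSize;
        }

        public void put(K key, V value) {
            ValueHolder<K, V> holder = new ValueHolder<K, V>(key, value);
            map.put(key, new SoftReference<ValueHolder<K, V>>(holder));
            touch(holder);
        }

        public V get(K key) {
            SoftReference<ValueHolder<K, V>> ref = map.get(key);
            ValueHolder<K, V> holder = (ref == null) ? null : ref.get();
            if (holder == null) {
                return null; // never cached, or already cleared by the GC
            }
            touch(holder); // a hit appends another hard reference
            return holder.value;
        }

        private synchronized void touch(ValueHolder<K, V> holder) {
            if (hardCacheSize <= 0) {
                return;
            }
            if (hardCache.size() >= hardCacheSize) {
                hardCache.removeFirst(); // evict the least recently used
            }
            hardCache.addLast(holder);
        }
    }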

Explaining the internal “hard cache”

This is merely an auxiliary structure. Its content does not represent the cache’s actual content, which will always be identical to that of the internal WeakHashMap – this, by the way, is a sine qua non condition. Other than that, the following rules apply:
  1. when a new object is added to the cache, a new reference to it is also immediately appended to the “hard cache”;
  2. when an object that exists in the cache is requested, a new reference to this object is appended to the “hard cache”;
  3. when the number of elements in the “hard cache” reaches the maximum limit defined for it, then, immediately before a new reference is appended, the oldest (least recent) one is removed in order to make room for the new (most recent) one;
  4. the size limit imposed on the “hard cache” does not apply to the cache itself but solely and exclusively to the “hard cache”.
From that, we can extract these corollaries:
The “hard cache” contains only “hard” (normal) references to objects that exist in the cache – and are therefore contained in its internal WeakHashMap – in the natural, sequential order in which those references were added. Hence, the earlier a reference appears in the “hard cache”, the less recently it was used.

There will never be a reference in the “hard cache” to an object that does not exist in the WeakHashMap. The contrary, however, is allowed; in other words, an object may exist in the WeakHashMap even if there aren’t any references to it in the “hard cache”. In that case, that particular object is available for disposal if it’s not referenced at all from outside the cache.

The quantity Q of references in the “hard cache” to a single cached object may be defined as:

Q ranges from 1 to N, where:
1 corresponds to an object that was simply added to the cache but never requested from it;
N = min( 1 + total # of requests for that object , S );
S = the maximum size defined for the “hard cache”.

From the definition of Q above, in an extreme scenario, nothing prevents the “hard cache” from being completely filled, for some period of time, with references to one single object A in the cache. For that to happen, these 3 conditions must be met by a consumer application:
  1. add object A to the cache;
  2. make N requests for A to the cache, where N = S (the size limit defined for the “hard cache”) – 1;
  3. the requests mentioned in the previous condition must be contiguous; they cannot be interleaved with requests for, or additions of, other objects.
Just to illustrate how both caches work, consider this scenario:
Given objects A, B, C, D and E, and the following sequence of operations on a cache whose maximum “hard cache” size is limited to 10 elements:

   1: put ( “c” , C ); 
   2: put ( “a” , A ); 
   3: put ( “d” , D ); 
   4: get ( “a” ); 
   5: get ( “e” ); // returns null as “e” wasn’t added to the cache yet 
   6: get ( “a” ); 
   7: get ( “d” ); 
   8: put ( “b” , B ); 
   9: put ( “e” , E ); 
  10: get ( “b” );
  11: get ( “e” );
  12: get ( “a” );

Then the internal state of the cache would be:

Cache's internal state after the given sequence of operations

Notice that there are no references (arrows) from the “hard cache” to the map entry containing object C: the only reference to it, appended at step 1, was the oldest in the list and was evicted at step 12, when the 11th reference was appended to a “hard cache” limited to 10 elements (step 5 appends nothing, since it returns null). Therefore, provided there aren’t any references to C from outside the cache either, if the GC were to run now, there’s a fair chance the internal state of the cache would become this:

Cache's probable internal state after a GC cycle
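To tie this back to the code, here is the same sequence replayed against the WeakCache sketch shown earlier (the hard-cache contents in the comments are derived by hand from the rules above):

    Object A = new Object(), B = new Object(), C = new Object(),
           D = new Object(), E = new Object();
    WeakCache<String, Object> cache = new WeakCache<String, Object>(10);

    cache.put("c", C);  // hard cache: [c]
    cache.put("a", A);  // [c, a]
    cache.put("d", D);  // [c, a, d]
    cache.get("a");     // [c, a, d, a]
    cache.get("e");     // returns null - nothing is appended
    cache.get("a");     // [c, a, d, a, a]
    cache.get("d");     // [c, a, d, a, a, d]
    cache.put("b", B);  // [c, a, d, a, a, d, b]
    cache.put("e", E);  // [c, a, d, a, a, d, b, e]
    cache.get("b");     // [c, a, d, a, a, d, b, e, b]
    cache.get("e");     // [c, a, d, a, a, d, b, e, b, e] - full (10)
    cache.get("a");     // [a, d, a, a, d, b, e, b, e, a] - "c" evicted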

Don’t forget to get the sample code

For convenience, I’ve also made sample implementations of both caches available for download.
Beware of the obvious fact that, being sample implementations, they’re only intended to illustrate the concepts described here and are not to be used as-is in real-world applications – even though they make very good initial templates. Some of the issues I see for a high-end, real-world application are:
  1. the use of the wrapper java.util.Collections.synchronizedMap(…) as the solution for concurrent access to the internal WeakHashMap, as opposed to a more fine-grained and efficient use of java.util.concurrent.locks.ReentrantReadWriteLock to establish and control all the needed critical regions;
  2. the use of a java.util.LinkedList as the internal “hard cache” has the main advantage of guaranteeing constant-time performance for the most used operations – appending and removing at the ends of the list – but the remove(…) operation for an arbitrary element requires a linear scan. This is actually the main issue with these samples, because it tends to add significant overhead when explicitly removing an object from the cache. I don’t have a quick, ready solution for this particular problem, but it could involve using a more sophisticated data structure and/or some parallel processing to mitigate the overhead; one possible direction is sketched below.
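For instance, one option (my suggestion, not part of the original samples) is a java.util.LinkedHashMap in access order with removeEldestEntry(…) overridden: add, hit and explicit removal all become O(1), at the cost of keeping at most one hard reference per object instead of allowing duplicates:

    import java.util.LinkedHashMap;
    import java.util.Map;

    class HardCache<T> extends LinkedHashMap<T, Boolean> {

        private final int maxSize;

        HardCache(int maxSize) {
            // accessOrder = true makes iteration order the LRU order
            super(16, 0.75f, true);
            this.maxSize = maxSize;
        }

        // Inserts the object or refreshes its LRU position on a cache hit.
        void touch(T object) {
            put(object, Boolean.TRUE);
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<T, Boolean> eldest) {
            return size() > maxSize; // evict the least recently used entry
        }
    }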
So, enjoy and…
Keep reading!

* credits go to Mr. Eric Chan for the conceptual design and to Mrs. Khushboo Bhatia for the implementation of both caches.

2 comments:

Praveen Sharan said...

Very good article. Thanks for sharing this.

paphko said...

Thank you for sharing your ideas about more sophisticated caching solutions.
I have a suggestion for implementing the hard cache: what about using an array?
This should be minimal overhead and it stores hard references to the last n elements:

private int hardCachePointer = 0;
private MyType[] hardCache = new MyType[HARD_CACHE_SIZE];


Add: hardCache[hardCachePointer++ % hardCache.length] = key;
Removal is done automatically.
Clear: hardCache = new MyType[hardCache.length];

What do you think?