Ehcache Travel Mug

Like Ehcache and want to show your support? We have a wide range of Ehcache merchandise including T-Shirts, mousemats, coffee mugs and lots elese. Available for worldwide shipping.

Buy merchandise from the Ehcache store at Cafe Press.

Distributed Caching with ehcache

Ehcache provides a pluggable distributed caching mechanism. This enables for multiple CacheManagers and their caches in multiple JVMs to share data with each other.

Pluggable Mechanisms

Ehcache has a pluggable cache replication scheme which enables the addition of cache replication mechanisms.

The following distribution mechanisms are supported in ehcache 1.5:

  • RMI
  • JGroups
  • JMS
  • Terracotta
  • Cache Server

Each of the is covered in its own chapter.

The need for shared cache data

Many production applications are deployed in clusters. If each application maintains its own cache, then updates made to one cache will not appear in the others. A workaround for web based applications is to use sticky sessions, so that a user, having established a session on one server, stays on that server for the rest of the session. A workaround for transaction processing systems using Hibernate is to do a session.refresh on each persistent object as part of the save. session.refresh explicitly reloads the object from the database, ignoring any cache values.

Replicated Caches

One solution is to replicate data between the caches to keep them consistent, or coherent. Typical operations which Applicable operations include:

  • put
  • update (put which overwrites an existing entry)
  • remove

Update supports updateViaCopy or updateViaInvalidate. The latter sends the a remove message out to the cache cluster, so that other caches remove the Element, thus preserving coherency. It is typically a lower cost option than a copy.

Using a Cache Server

Ehcache 1.5 supports the Ehcache Cache Server.

To achieve shared data, all JVMs read to and write from a Cache Server, which runs it in its own JVM.

To achieve redundancy, the Ehcache inside the Cache Server can be set up in its own cluster.

This technique will be expanded upon in Ehcache 1.6.

Notification Strategies

The best way of notifying of put and update depends on the nature of the cache.

If the Element is not available anywhere else then the Element itself should form the payload of the notification. An example is a cached web page. This notification strategy is called copy.

Where the cached data is available in a database, there are two choices. Copy as before, or invalidate the data. By invalidating the data, the application tied to the other cache instance will be forced to refresh its cache from the database, preserving cache coherency. Only the Element key needs to be passed over the network.

Ehcache supports notification through copy and invalidate, selectable per cache.

Potential Issues with Distributed Caching

Potential for Inconsisent Data

Timing scenarios, race conditions, delivery, reliability constraints and concurrent updates to the same cached data can cause inconsistency (and thus a lack of coherency) across the cache instances.

This potential exists within the ehcache implementation. These issues are the same as what is seen when two completely separate systems are sharing a database; a common scenario.

Whether data inconsistency is a problem depends on the data and how it is used. For those times when it is important, ehcache provides for synchronous delivery of puts and updates via invalidation. These are discussed below:

Synchronous Delivery

Delivery can be specified to be synchronous or asynchronous. Asynchronous delivery gives faster returns to operations on the local cache and is usually preferred. Synchronous delivery adds time to the local operation, however delivery of an update to all peers in the cluster happens before the cache operation returns.

Put and Update via Invalidation

The default is to update other caches by copying the new value to them. If the replicatePutsViaCopy property is set to false in the replication configuration, puts are made by removing the element in any other cache peers. If the replicateUpdatesViaCopy property is set to false in the replication configuration, updates are made by removing the element in any other cache peers.

This forces the applications using the cache peers to return to a canonical source for the data.

A similar effect can be obtained by setting the element TTL to a low value such as a second.

Note that these features impact cache performance and should not be used where the main purpose of a cache is performance boosting over coherency.

Use of Time To Idle

Time To Idle is isconsistent with distributed caching. Time-to-idle makes some entries live longer on some nodes than in others because of cache usage patterns. However, the cache entry "last touched" timestamp is not replicated across the distributed cache.

Do not use Time To Idle with distributed caching, unless you do not care about inconsistent data across nodes.