distributed: All content tagged as distributed in NoSQL databases and polyglot persistence
This is a guest post by Greg Luck, Founder and CTO, Ehcache .
Is Ehcache a NoSQL store? No, I would not characterise it as that, but I have seen it used for some NoSQL use cases. In these situations it compared very well — with higher performance and more flexible consistency than the well-known NoSQL stores. Let me explain.
Ehcache is the de facto open source cache for Java. It is used to boost performance, offload databases and simplify scalability. Backed by the Terracotta Server Array, Ehcache becomes a linearly scalable distributed cache. It is a schema-less, key-value, Java-based distributed cache. It provides flexible consistency control, data durability, and with the release of Ehcache 2.4, search by key, value and attribute indexes.
From the very first integration of Ehcache and Terracotta, we enabled coherent data sharing across a cluster. In recognition of the CAP theorem and the limitations of having a single hard-coded trade-off, we created a rich consistency model configurable on a cache-by-cache basis. And to ease understanding, we adopted the standard client consistency model to describe and configure it.
We offer a much richer consistency model than NoSQL solutions. Across a cluster we offer on a per cache basis:
- Pessimistically locked strong consistency (the default)
- Unlocked Weak Consistency, with read-your-writes, monotonic reads and monotonic writes
- Optimistically locked Compare and Swap (“CAS”) atomic operations across the cluster
- XA and Local transactions
- An explicit locking API that allows custom consistency
The Ehcache architecture is very different from NoSQL architectures. With Ehcache, each application server JVM has a resident hot set of the cache determined using an LRU algorithm. The size of that hot set can be 4-6 GB for heap storage, and with BigMemory, can be hundreds of GBs. This is the Level 1 (“L1”) cache and is entirely in-process. Access from the L1 is less than 1 μs.
The entire cache is always stored in the Terracotta Server Array. This is the Level 2 (“L2”) cache. Access from the L2 is less than 2 ms.
The mileage you get from this architecture depends on the nature of your usage profile. The most common one, the Pareto distribution, will read from a correctly-sized L1 80% of the time, and from the L2 20% of the time. So the average latency for the most common case is less than .401 ms. By comparison, the Yahoo! Cloud Serving Benchmark shows the average latencies of HBase and Cassandra ranging from 8 to 18 ms.
That makes Ehcache an order of magnitude faster than NoSQL.
Caches can be set to be persistent, which gives durability and restartability. A write ahead log is used, and the cache will recover to a consistent state. When configured as a distributed cache, Ehcache uses the Terracotta Server Array as a Level 2 cache. Terracotta servers are usually deployed in pairs for each partition, giving HA. In addition, backups can be taken from Terracotta using our JMX tools, or directly from the underlying storage mechanism. Backups and recoveries can be done live. RDBMSs typically offer a very extensive tool set for archiving, ETL and so on. For that reason I think of Ehcache as being durable with a small “d”. Having said that, perhaps this also applies to many NoSQL implementations.
The definition of Big Data is a moving target. Today it is generally understood to start at a few dozen terabytes and go up into petabytes.
By this definition we do not do Big Data, but we are close. We have seen users take Ehcache up to about 2 TB with the current implementation. And if the past is anything to go by, each major release of Ehcache supports data volumes that have increased by an order or magnitude.
Another way to look at Big Data is that it is defined as data that is too large or unwieldy to process using RDBMSs. Because the mean latency from the cache is much lower for data retrieval, Ehcache works well for applications requiring rapid response, and thus serves big data hot sets well.
Finally, with BigMemory we can create very high densities. A Terracotta server can hold 250GB or more in memory per server. NoSQL solutions use a mix of memory and disk. Java based ones like Cassandra are subject to garbage collection issues, so are limited to running with very small heaps. BigMemory stores cache data off-heap but within the Java process using NIO’s DirectByteBuffer. The end result is that you can get the same storage using a much smaller server deployment.
Enter Distributed Caching
So if Ehcahce is not NoSQL, what is it? The answer is that Ehcache is a distributed cache. Like its NoSQL cousins, it is often used when the database cannot cope. But more than just data can be cached. For example, caching a web page or an expensive CPU computation are also common use cases.
The focus is on fast access to these cached results, not persistence. This fast access is expressed architecturally in Java as in-process caching, and over the network as in-memory cache stores.
Helping to draw the distinction, both Gartner and Forrester have in the last year created their respective definitions of Distributed Caching and Elastic Caching. According to Gartner, “Distributed caching platforms enable users to manage very large in-memory data stores to enable DBMS workload offloading, cloud and cloud transaction processing, extreme transaction processing, complex-event processing, and high-performance computing”. They also added distributed caches to their application Platform as a Service (“aPaaS”) reference architecture.
This makes sense. We see distributed caches being used alongside both RDBMSs and NoSQL.
But once the cache gets distributed and takes on enterprise features like search, the capabilities expand and overlap with NoSQL. Yet the difference in emphasis remains.
So what are some use cases where you might want to consider a distributed cache? In general, any use case where a ‘key-value plus search’-type NoSQL solution fits. As long as data volumes are less than 2 TB and the durability toolset is acceptable.
Specifically we see the following use cases:
General Purpose Caching
e.g. Hibernate Caching. JDBC caching. Web caching. Collection caching. We do this one pretty thoroughly.
In conjunction with an in-house or third-party analytics engine, very fast lookup of analytics results.
e.g. A credit card company needs to score real-time credit card transactions. There are hundreds of millions per day. Results of an in-house fraud model with transactions up to the close of business the previous day are loaded into the cache. The cache is further adjusted during the current business day for actual usage and can return fraud scores on billions of credit card numbers in a fraction of a second.
System of Record (“SOR”) for short to medium term business processes
e.g. A phone company does mobile phone contract processing and provisioning with rapidly changing plan details. They create a value object for each plan and persist that to the cache, avoiding database schema changes thus adding business agility. The final result of the provisioning is recorded in the database after approximately two weeks. The value in the cache is akin to the document in a Document Store.
In-memory dataset search, including applications where all data can be held in memory as well as those where a partial (or hot) dataset is in memory
e.g. A logistics company needs to lookup consignment nodes by id, sender name, addressee name and date range. They generate 400 GB per fortnight and 98% of searches are within two weeks. The database is overloaded handling the volume of lookups. By storing the most used data in the cache a 98% database offload can be achieved.
Distributed Caches and NoSQL Stores have been born out of the same need to supplement the RDBMS in powering web scale architectures. Both are cognizant of the CAP theorem and the impossibility of taking along the old certainties into a large scale distributed world.
While NoSQL is aimed at replacing the durability feature of RDBMS, caches are aimed at low latency and speed. Moreover, the cache is also seeking to avoid forcing the application to go out to any store, whether it be RDBMS or NoSQL.
I see distributed caches and NoSQL as being two useful and complimentary technologies that can supplement the RDBMS for what it cannot do, but also provide their own unique new features.
Detecting failures is a fundamental issue for fault-tolerance in distributed systems. […]
We present a novel abstraction, called accrual failure detectors, that emphasizes flexibility and expressiveness and can serve as a basic building block to implementing failure detectors in distributed systems. Instead of providing information of a boolean nature (trust vs. suspect), accrual failure detectors output a suspicion level on a continuous scale.
The architectural difference between traditional and accrual failure detectors:
Original title and link: Distributed Systems: The Phi Accrual Failure Detector Paper (NoSQL databases © myNoSQL)
Justin Sheehy talking about the Riak’s internals and how they’ve build Riak as a distributed, robust, scalable, and extensible NoSQL database
For Redis users, it shouldn’t be a surprise that Salvatore Sanfilippo (@antirez) has been working on Redis clustering lately. In the videos embedded below, Salvatore explains the details of Redis clustering .
Darren Wood presentation containing a mix of details on distributed graph databases and graph databases use cases:
According to Darren distributed graph databases characteristics are:
- High performance distributed persistence
- Ability to deal with remote data reads (fast)
- Intelligent local cache of subgraphs
- Distributed navigation processing
- Distributed, multi-source concurrent ingest
- Write modes supporting both strict and eventually consistency
You might want to check Marko Rodrigues’ Graph Databases: more than an introduction for a more complete list of possible graph databases use cases.
- Darren Wood, Chief Architect InfiniteGraph (↩)
Whilst CouchDB includes replication and a conflict-flagging mechanism, this is not the whole story for building an application which replicates in a way which users expect.