How Big a Difference Will This Make?
Long promised and now finally part of Redis, clustering had to wait in line. It arrived later than other functionality for two reasons. First, user demand for other features such as persistence, replication, low latency and introspection (determining the structure of a database at run time) was even stronger than demand for clustering. Second, implementing clustering was a significant technical challenge. Redis data structures and commands are complex, and operating requirements call for high throughput and low latency. A cluster architecture should also be hidden from a user’s application so that code can run without modification, while still allowing the datastore to scale out across many nodes.
What a Redis Cluster Does
A Redis Cluster automatically shards data across multiple Redis nodes. If some nodes fail or are unable to communicate, the rest of the cluster can continue to operate, so the cluster survives certain types of failure. The sharding strategy also allows keys to be resharded from one node to another while the cluster is in operation. Users can therefore use Redis Cluster to split large datasets across nodes automatically, with a certain level of availability. However, commands that operate on multiple keys cannot be used when those keys are stored on different nodes. Supporting them would mean moving data from one Redis node to another, which would hurt both performance and the predictability of performance.
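To make the sharding idea concrete, here is a minimal Python sketch of how a key can be mapped to a shard. It relies on details from the Redis Cluster specification rather than from this article: the key space is divided into 16384 hash slots, a key's slot is the CRC16 (XMODEM variant) of the key modulo 16384, and a "hash tag" in braces restricts hashing to part of the key. The function names `hash_tag`, `key_slot` and `same_slot` are illustrative and are not part of any Redis client library.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots, per the Redis Cluster specification


def hash_tag(key: bytes) -> bytes:
    """Return the portion of the key that is actually hashed.

    If the key contains a non-empty {...} section, only the text inside
    the first such pair of braces is hashed; otherwise the whole key is.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: bytes) -> int:
    """Map a key to a hash slot using CRC16/XMODEM modulo 16384."""
    # binascii.crc_hqx with an initial value of 0 is the XMODEM CRC16.
    return binascii.crc_hqx(hash_tag(key), 0) % HASH_SLOTS


def same_slot(*keys: bytes) -> bool:
    """True if all keys fall into one slot, and therefore live on one node."""
    return len({key_slot(k) for k in keys}) == 1


if __name__ == "__main__":
    print(key_slot(b"user:1000"))
    # Keys sharing a hash tag always land in the same slot.
    print(same_slot(b"{user:1000}:profile", b"{user:1000}:cart"))
```

Because each slot is served by exactly one node at a time, a check like `same_slot` shows when a multi-key command would not need data from more than one node. Resharding then amounts to reassigning slots, and the keys in them, from one node to another.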
Neither CP nor AP, but Somewhere In Between
In terms of the CAP model (consistency, availability and partition tolerance), Redis Cluster trades off these characteristics in a way that makes it neither CP nor AP. Instead, it provides limited availability during partitions, together with eventual consistency. If nodes in the cluster become desynchronized because of a partition, they will eventually agree again on the value of a given key once the partition heals. It is, however, possible to lose write operations that are made during a partition. This is a deliberate design choice that reduces memory overhead and avoids restrictions on the API, at the cost of less safety during partitions.
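The lost-write scenario can be illustrated with a toy simulation. This is a simplified model, not Redis code. It assumes that a master acknowledges a write before the write reaches its replica (asynchronous replication, which this article does not spell out), and that the replica is promoted while the old master is cut off. The `Node` class and the `write`, `replicate` and `simulate` functions are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    data: dict = field(default_factory=dict)
    # Writes that were acknowledged to the client but not yet replicated.
    pending: list = field(default_factory=list)


def write(master: Node, key: str, value: str) -> str:
    """Apply a write locally and acknowledge it before replication."""
    master.data[key] = value
    master.pending.append((key, value))
    return "OK"


def replicate(master: Node, replica: Node) -> None:
    """Ship pending writes to the replica (only possible while connected)."""
    for key, value in master.pending:
        replica.data[key] = value
    master.pending.clear()


def simulate() -> dict:
    master, replica = Node("master"), Node("replica")

    write(master, "a", "1")
    replicate(master, replica)  # link is healthy, so the replica catches up

    # A partition now isolates the master together with one client.
    write(master, "b", "2")  # acknowledged, but never reaches the replica

    # The other side of the partition promotes the replica. When the
    # partition heals, the old master adopts the new master's dataset.
    new_master = replica
    master.data = dict(new_master.data)
    master.pending.clear()
    return new_master.data


if __name__ == "__main__":
    print(simulate())  # the acknowledged write to "b" is missing
```

In this model the nodes converge on a single value for every key once the partition heals, which is the eventual consistency described above. The price is that the write to `b` is silently dropped, even though the client received an acknowledgement for it.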
Competitors and Choices
The introduction of Redis Cluster puts Redis on a stronger footing compared to Memcached, one of its main rivals among in-memory key-value stores. Memcached deployments could already be spread across multiple servers, typically by sharding keys on the client side. Now the user choice between the two will likely come down to performance differences in given applications or contexts, such as key-value or string handling. However, other organizations had already put their own versions of Redis clustering in place before the official version became available. Redis Labs, for instance, describes its Redis Cloud as being built from the beginning to offer Redis clusters of any size that support all Redis commands. Redis Cloud also offers Redis cluster replication, persistence, backup and auto-failover.
The Future for Redis Clustering
Salvatore Sanfilippo, the creator of Redis, has discussed possible new features for a future release of Redis Cluster. They include multi-data-center support, additional write safety and improved automatic node balancing. User feedback is also likely to play a significant role in determining what gets done and when, just as it has for Redis overall.
You can read more in the Redis Cluster Tutorial, and there is also a PDF that describes how Redis Cluster works at a high level.