Riak started life as an engine for a sales force automation venture, and in 2009 Basho Technologies, the creator of Riak, released it as open source. Riak was designed with Amazon’s Dynamo database and the CAP theorem (Consistency, Availability, Partition tolerance) in mind. This led to its fault tolerance, which it achieves through automatic data replication and distribution across clusters of systems. Like Dynamo, Riak responds to requests rapidly even at terabyte data volumes.
When Would You Choose Riak?
Applications that are very sensitive to downtime are good candidates for Riak; factory control systems and point-of-sale data collection are two examples. Riak handles high-volume, high-velocity and high-variety data, with replication built into its architecture. It also offers tunable trade-offs for both replication and distribution via user-adjustable parameters. For example, the Riak ‘N value’ defines how many replicas of a piece of data are stored. Riak offers capabilities similar to those of Dynamo, while limiting complexity and possible ‘data bloat’.
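The trade-off behind the N value can be sketched with a little arithmetic. The snippet below is an illustration only and does not call Riak. The read and write quorum sizes (`r` and `w`) are Dynamo-style parameters that the text above does not describe, and the function names are hypothetical. It also assumes an idealized strict quorum, which a real cluster may relax.

```python
def tolerated_failures(n_val: int, quorum: int) -> int:
    """Replicas that may be unreachable while a request that needs
    `quorum` replica responses can still succeed."""
    if not 1 <= quorum <= n_val:
        raise ValueError("quorum must be between 1 and n_val")
    return n_val - quorum


def read_sees_latest_write(n_val: int, r: int, w: int) -> bool:
    """True when every set of r replicas must share at least one member
    with every set of w replicas (r + w > n_val), assuming strict quorums."""
    return r + w > n_val


if __name__ == "__main__":
    # Three replicas, two responses required for reads and for writes.
    print(tolerated_failures(n_val=3, quorum=2))
    print(read_sees_latest_write(n_val=3, r=2, w=2))
    # Requiring only one response favours availability over overlap.
    print(read_sees_latest_write(n_val=3, r=1, w=1))
```

Raising the replica count or lowering the required number of responses lets more nodes fail without blocking requests, at the cost of more storage or weaker overlap between reads and writes.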
Language, Libraries and Licensing for Riak
Riak is written in Erlang, with some C/C++. Erlang also gives Riak hot code loading, the ability to change code while the system is running. Licensing is available in two flavors. The first is the Apache License 2.0 for the open source edition. The second is a commercial license from Basho, which adds multisite replication and SNMP monitoring functionality. Client libraries are currently available for Ruby, Java, Erlang, Python, PHP and C/C++. Riak CS is also available as an extension to Riak, providing a cloud-like object storage layer on top of the Riak platform.
Starting to Work with Riak
Riak runs on Linux, BSD, Mac OS X and Solaris. Hosting providers today include Amazon Web Services, Engine Yard, Joyent, SoftLayer and Windows Azure. Once installed, the data store provides two methods for accessing data. The first is via HTTP, with standard PUT and POST operations to write data and GET operations to read it. The second is via an API based on Google’s Protocol Buffers specification. By specifying the bucket (‘table’) and key, applications can retrieve Riak objects directly, which is also the fastest way to read them.
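The HTTP access described above can be sketched with Python's standard library. This is a minimal illustration, not a definitive client. The base URL (a local node on port 8098), the `/buckets/<bucket>/keys/<key>` path layout, and the helper names are assumptions to check against the documentation for your Riak version. The helpers only build requests, so nothing is sent until a request is passed to `urllib.request.urlopen`.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a Riak node listening for HTTP on localhost, port 8098.
BASE_URL = "http://127.0.0.1:8098"


def object_url(bucket: str, key: str, base_url: str = BASE_URL) -> str:
    """Build the URL that addresses one object by bucket and key."""
    return f"{base_url}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def build_write(bucket: str, key: str, body: bytes,
                content_type: str = "application/json") -> Request:
    """Prepare a PUT request that stores `body` under bucket/key."""
    return Request(object_url(bucket, key), data=body, method="PUT",
                   headers={"Content-Type": content_type})


def build_read(bucket: str, key: str) -> Request:
    """Prepare a GET request that fetches the object at bucket/key."""
    return Request(object_url(bucket, key), method="GET")


if __name__ == "__main__":
    write = build_write("users", "alice", b'{"plan": "basic"}')
    read = build_read("users", "alice")
    print(write.get_method(), write.full_url)
    print(read.get_method(), read.full_url)
```

Against a running node, each prepared request could be sent with `urllib.request.urlopen(request)`. Because the bucket and key fully determine the URL, a read needs no query step before fetching the object.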
Who Uses Riak Today
Reportedly, 25 percent of the Fortune 50 has installed Riak. AT&T, AOL, Ask.com, Best Buy, Boeing, Bump, Braintree, DataPipe, Disqus, Gilt Groupe, GitHub, the UK National Health Service (NHS), OpenX, Rovio (of Angry Birds fame), Symantec, TBS, The Weather Channel, Workday, Voxer, Yahoo! Japan, and Yandex (Russian search engine and portal) are some of the better-known organizations using this key-value data store. In particular, Comcast uses it to store user profile data for its Xfinity TV mobile application, the Danish Health Service uses it for the medical prescription histories of all Danish citizens, social network Yammer also uses Riak, and GitHub uses it for web server functionality for GitHub Pages.