Guide to Document Databases – NoSQL Explained

How do you store and retrieve different kinds of documents in traditional databases that use fixed data schemas? The answer is ‘with difficulty’. Traditional SQL or relational databases have rigidly defined ways of organizing their data. Once the schema is in place, any application using that database can only add or retrieve records that match that schema. However, another approach – that of the NoSQL (‘Not only SQL’) document database – allows a much more flexible solution.

Documents for Speed and Flexibility

NoSQL document databases use the concept of a document instead of a table or ‘relation’. They are designed to handle semi-structured data that does not fit neatly into relational tables. NoSQL databases in general have come to the forefront as the limitations of relational databases in an increasingly Big Data world have become apparent. Further advantages of NoSQL databases, also known as NoSQL stores, can include greater scalability and faster data storage and retrieval. Whether you choose an RDBMS (relational database management system) or a NoSQL database will depend on your application and your requirements. Both have roles to play.

Handling Document Structure

At the simplest level, document databases store pairs composed of a key and a ‘document’, which is a more or less complex data structure. A document can itself contain key-value pairs, keys that map to arrays, and possibly nested documents. Unlike NoSQL key-value stores, where the database system does not interpret the objects it holds, NoSQL document stores do recognize the structure of the documents they hold. Documents can also be grouped into collections. Like a key-value store (and other NoSQL databases), a document store can also partition and replicate data for automatic recovery.
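
As an illustration of this structure, here is a minimal sketch in Python that uses a plain dictionary as a stand-in for a document. The field names, the key `article:1` and the `articles` collection are hypothetical and are not taken from any particular product.

```python
import json

# A hypothetical document: simple key-value pairs, an array, and a nested document.
article = {
    "title": "Guide to Document Databases",
    "tags": ["nosql", "documents"],                     # key mapping to an array
    "author": {"name": "A. Writer", "country": "UK"},   # nested document
}

# A collection modeled as a mapping from key to document.
articles = {"article:1": article}

# Unlike an opaque key-value store, the structure can be inspected field by field.
author_name = articles["article:1"]["author"]["name"]

# Documents of this shape serialize naturally to JSON and back.
encoded = json.dumps(article, sort_keys=True)
decoded = json.loads(encoded)
```

A real document store would add persistence, indexing and replication on top of this basic key-to-document mapping.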

APIs and Document Encoding

An API or a query language then allows users and applications to retrieve documents based on their content, or on the values of certain fields. The structure of the fields in the documents is dynamic: users can freely add, modify or remove fields in existing documents. Frequently used document encodings that support this include XML and JSON (JavaScript Object Notation). Some document stores can also hold PDF and Microsoft Office files (such as MS Word and Excel), although these are usually stored as binary content rather than as sets of editable fields.
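
To make the idea of querying by field value and changing fields on the fly more concrete, here is a small sketch in Python. It assumes an in-memory list standing in for a collection. The `find` helper and all field names are hypothetical and do not correspond to the API of any specific database.

```python
from typing import Any

# Hypothetical in-memory collection; documents need not share the same fields.
collection: list[dict[str, Any]] = [
    {"_id": 1, "type": "book", "title": "NoSQL Basics", "pages": 210},
    {"_id": 2, "type": "video", "title": "Intro to JSON", "minutes": 12},
    {"_id": 3, "type": "book", "title": "Schema Design"},
]

def find(docs, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]

# Retrieve documents by the value of a field.
books = find(collection, type="book")

# Fields are dynamic: add one to a single document, remove one from another.
collection[2]["pages"] = 95
del collection[0]["pages"]

# Only the documents that currently carry a "pages" field.
with_pages = [d["_id"] for d in collection if "pages" in d]
```

No schema change is needed for either modification, which is the main contrast with a relational table.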

Possible Uses

Users of NoSQL document stores include publishing and media companies. They can store different kinds of text-heavy data on one platform and make it available for e-learning and research through content-driven applications. More generally, organizations that generate large collections of data see benefit in using NoSQL document databases to bring different data types together. Departments can cross-reference each other’s information, which improves enterprise-wide understanding and helps avoid silo thinking.

Examples of NoSQL Document Database Systems

Popular NoSQL document database solutions include CouchDB, MongoDB and RavenDB. The source code of all three is publicly available, although their licenses, backgrounds and profiles differ. CouchDB is a project of the Apache Software Foundation: it accepts queries written in JavaScript and returns views of the data as simple JSON objects. It also offers eventual consistency, meaning that data is synchronized between different machines on a ‘sooner or later’ basis.

By comparison, MongoDB offers both eventual and immediate consistency. In this respect, MongoDB is closer to an RDBMS, with stronger protection against data loss. While it runs under Linux, Mac OS X and Windows, MongoDB does not offer an Android version (CouchDB does). For Windows platform users, RavenDB offers similar NoSQL document store functionality, plus an integrated .NET API for application integration.
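
The ‘sooner or later’ behavior of eventual consistency mentioned above can be sketched with a toy example in Python. It assumes two in-memory replicas and a simple highest-version-wins rule. This is only an illustration of the general idea and is not how CouchDB or MongoDB actually replicate data. All names are hypothetical.

```python
# Each replica maps a key to (version, document). Names are hypothetical.
replica_a = {"doc1": (1, {"title": "Draft"})}
replica_b = {"doc1": (1, {"title": "Draft"})}

# A write lands on replica A only; replica B is temporarily stale.
replica_a["doc1"] = (2, {"title": "Final"})
stale_title = replica_b["doc1"][1]["title"]

def sync(source, target):
    """Copy entries from source that are missing from, or newer than, target."""
    for key, (version, doc) in source.items():
        if key not in target or target[key][0] < version:
            target[key] = (version, dict(doc))

# Sooner or later the replicas exchange changes and converge.
sync(replica_a, replica_b)
sync(replica_b, replica_a)
converged = replica_a == replica_b
```

Between the write and the sync, a reader of replica B still sees the old title. Immediate consistency would rule out that window, usually at some cost in latency or availability.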
