2013-10-25

The base difference between SQL and NoSQL databases. MS SQL Server vs. MongoDB.

If there is no real reason to resign from SQL database do not do this. This is the main rule J.

The main reason why you can ask youself why you need to move to NoSQL databases are necessity in huge very huge data storage (in other words Big Data), scalability and performance reasons.

But, you can ask yourself, what is the difference between them? Why I have to use one instead of another? The table below contains the brief difference explanation. More details beneath in the article.Relational databases were developed in 1970s. It was very courteous way to store data and satisfied those day needs. But today, relational databases are not able to solve the needs of the current scope for storing gigantic data. NoSQL technic for storing data is a substitute for solving the nowadays needs.

One of the most well-known and leading NoSQL database is MongoDB



MS SQL
MongoDB
Data Storage Model
Relational DBMS
Document-oriented
JOINs
Yes
No
Transaction
No
Support agile practices
No
Yes
Data schema
Fixed
Dynamic
Vertical
Horizontal
Yes (depending on software edition)
Primary-Secondary
No
Yes
Query Language
SQL query language
JSON query language
Secondary Indexes
Yes
Yes
Triggers
Yes
No
Foreign keys
Yes
No
Concurrency
Yes
Yes
Official Website
Company
Microsoft
MongoDB, Inc
Licence
Commercial
Open Source
Implementation language
C++
C++
OS support
Windows
Windows, Linux, OS X, Solaris
Drivers for programming languages
.NET, Java, PHP, Python, Ruby, Visual Basic
Actionscript, C, C#, C++, Clojure, ColdFusion, D, Dart, Delphi, Erlang, Go, Groovy, Haskell, Java, JavaScript, Lisp, Lua, MatLab, Perl, PHP, PowerShell, Prolog, Python, R, Ruby, Scala, Smalltalk
MongoDB is NoSQL database with large quantity of supported features. MongoDB in comparison with relational databases is more fast and easy-scalable.


There are a couple of features which are not supported: JOINs and global transactions. The main reason why MongoDB doesn’t support them is the way how it was designed.

Data Storage Model

Relational databases can’t live without JOINs and transactions. Normalization in SQL databases expected to save data in multiple tables and JOINs can help to combine them during requests. Also, transaction is the only way to be sure that data are consistent in different tables.  

MongoDB was designed to store a huge data (BigData), to be easy-scalable, high-available and fast. JOINs and global transactions were a stumbling-block for designing such system and the only way to get rid of them is to use another data storage model. It means to avoid data splitting between tables during normalization process. Key-value pairs combined into the document were chosen as data storage model. The data are stored mainly in one collection (set of documents) without necessity to JOIN and watch over data consistency. MongoDB is called document-oriented database.

Agile practices

Relational databases were not designed to support Agile software development and be easy scalable, but NoSQL databases on the other hand support them.

The modern world uses Agile practices to develop products, agile sprints with quick iteration is the one of the main goals. Dynamic schema is the key of Agile support in MongoDB.

Database Schema

Relational databases (SQL) require established data schemas for data storage, before data will be added. Schema necessity doesn’t fit with agile development approaches. It’s hard to predict an exact db schema at the beginning of the feature development. If schema is changed in relational database you must think over data migration to the new schema. If data is large it’s a very slow process that involves significant downtime. If data is changed regularly the downtime may also be frequent.
NoSQL database in contrast to relational databases doesn’t require a predefined schema and what is more it doesn’t require schema at all. We can call it schemaless database. This feature is fully fit into the Agile approaches. The same collection can contain data with a significantly differing structure.

Scalability

Relational database can be scaled just vertically, because entire database has to be hosted in a single server. This is necessary in order to ensure reliability and continuous availability of data. Vertical scaling is too expensive, places limits on scale and weak fault-tolerance.

NoSQL databases were designed to scale horizontally. Instead of increasing power of one single server you just need to add more server instances to get expected power.
Usually it's too expensive and limited to buy more RAM or more powerful server instead of just add one small workstation into the cluster.


Sharding is the process of storing data records across multiple machines. It can be configured for SQL and NoSQL databases. But sharding for SQL databases is not natively supported. It can be configured through complex arrangements for making hardware act as a single server, but NoSQL databases, on the other hand, usually support auto-sharding on the native layer. It means that they automatically spread data across shards (server), support balancing, query loading and etc.

Query Language

Relational databases use SQL as query language to retrieve data. It’s very powerful query language which was designed for managing data in a relational database management system. The scope of SQL includes data insert, query, update, delete, schema creation and modification and data access control.

MongoDB uses JSON-style declarative language for specifying what data to retrieve from database. This query language contains all amounts of SQL features even more. Here you can find mapping chart between SQL and MongoDB.

Summary

To chose one database or another depends of many factors. How much data are you expected to store, whether fault-tollerance is vital for your system and etc.

7 comments:

  1. A few follow up questions, if you can help ... With respect to Other NoSQL databases such as Cassandra and HBase ( column based) and Key Value Stores ..Where do they stand ( or can be plotted) in the Graph (Scalability Vs Features graph. Thanks

    ReplyDelete
  2. Index64 is the first ever in-memory concurrent key-value sore. There are several advantages of using Index64. Its speed is between two and ten times faster than any competitor. Range queries are as fast as a simple GET request.Lock-free and thread-safe algorithms gives a full availability on data.


    NoSQL database

    ReplyDelete
  3. Thanks for providing this informative information you may also refer.
    http://www.s4techno.com/blog/2016/07/27/introduction-to-sql/

    ReplyDelete