SQL vs. NoSQL
According to our recent Java developer survey, most developers are using SQL or SQL-based relational database management systems (RDBMS) or procedural extensions. But, is that reflective of the overall distribution of “big data” and “small data” applications? Or does it reflect the quality of SQL vs. NoSQL?
In this article, we’ll discuss some of the reasons why SQL RDBMS are a popular choice for developers, cases where NoSQL may be a better option, and weigh the pros and cons of SQL and NoSQL for Java developers today. Let’s get started with an overview of SQL.
What Is SQL?
The Structured Query Language, or SQL, is a domain-specific language, or DSL, used to interact with relational databases and relational data streams. It’s often employed as the underlying DSL for RDBMS or procedural extensions like MySQL, Oracle DB, or PostgreSQL.
History of SQL
SQL (originally called SEQUEL) was developed by Raymond Boyce and Donald Chamberlain for IBM in the 1970s. Since then, it has become a bedrock of database management and offers functionality that extends it beyond a typical DSL.
Benefits of SQL
- ACID Transactions – SQL is ideal for situations where databases, regardless of where they sit, need atomicity, consistency, isolation, and durability, or ACID.
- Widely-adopted – SQL is used in most RDBMS and as a basis for several popular procedural extensions.
- Extended Functionality – SQL is widely-adopted, and with many options for control-of-flow extensions and procedural and OOP extensions (both paid and unpaid) SQL can be applied in functionally diverse ways.
- ANSI and ISO Compliant – SQL, as a DSL, is ANSI and ISO compliant, with NIST self-certification for individual RDBMS and procedural extensions.
Drawbacks of SQL
- Lack of Standardized Interfaces – Not all RDBMS and procedural extensions follow SQL standards, which can increase vendor lock-in and decrease interoperability.
- Proprietary Extensions – Popular extension vendors often lock functionality behind paid versions.
- Vertical Scaling – SQL, because of how it approaches data architecturally, is hard to scale horizontally. It requires hardware to scale, adding cost and a hard performance ceiling.
What Is NoSQL?
NoSQL is a database that facilitates the storage and retrieval of data. Commonly used for applications that require horizontal scaling, like real-time web applications, it relies on a non-SQL low-level query language to store and retrieve data.
History of NoSQL
NoSQL was originally developed as an alternative to SQL, and eschewed the tabular relations used in relational databases in favor of unique, and more flexible (albeit more specialized) databases. While originally conceived as an alternative to SQL, NoSQL systems often adopt SQL-like languages.
Benefits of NoSQL
- Horizontal Scaling – Because NoSQL treats documents as self-contained, they can exist on multiple servers without join constraints.
- Big Data – For applications that require a large amount of data to be stored and accessed, the overhead with vertical scaling RDBMS can be costly. For applications processing large amounts of data, NoSQL can prove to be a more stable and cost-effective solution.
- Specialized Architecture – NoSQL is designed for performance and scalability, and in situations where data isn’t predictable. For situations where those are the biggest needs, it’s often the best available choice.
- Lack of ACID – Not having ACID transaction constraints means fast performance and allows for horizontal scalability.
Drawbacks of NoSQL
- Lack of Adoption – NoSQL isn’t as ubiquitous as SQL, and part of that has to do with the limitations of NoSQL. The use of low-level query languages and the sheer volume of SQL adoption are big barriers to widespread adoption.
- Specialized – As mentioned above, NoSQL is a great fit for applications dealing with large amounts of (often unpredictable or disorganized) data that need to scale without adding significant cost. But outside of those situations, SQL-based solutions are often the better choice.
- Low-Level Query Language – While some NoSQL databases have extended NoSQL to allow for Joins, NoSQL doesn’t have the versatility, or interoperability of SQL.
- Lack of ACID – While this is more of an outdated drawback for NoSQL, not all NoSQL databases have true ACID transactions. It's also a benefit in the right situations (as noted above).
SQL vs. NoSQL
When developers compare SQL vs. NoSQL, the argument often comes back to a central theorem of database functionality – the CAP theorem. This theorem argues that it’s impossible for a database to meet the needs of consistency, availability, and partition tolerance.
|Gives true ACID transactions||Suitable for Horizontal scaling|
|Widely-adopted||Good for "big data"|
|Extensible||Optimized for performance|
|ANSI and ISO compliant||Optimized for scalability|
Looking at SQL, it would meet the criteria for consistency (since all databases can accurately reflect the most recent changes to that database). But, because of the lack of partitioning, it has a ceiling that can affect availability. NoSQL, on the other hand, typically wouldn’t meet the criteria for consistency (since all partitioned databases aren’t reflective of the changes to a single database).
When to Use SQL vs. NoSQL?
Because of the strengths and weaknesses of RDBMS vs. NoSQL, the decision on which database type to use should be made service by service.
When to Use SQL
- When You Need Data Consistency (ACID).
- When You Have Wide, Complex Data Queries.
When to Use NoSQL
- When you don't care about the data consistency (lack of ACID).
- When you need to read/write data fast.
- When you want to scale the database horizontally.
- When you work with a big amount of data.
When it Makes Sense to Use Both
Imagine a streaming video application with large, active data sets (that isn’t using Cassandra). The application will need to handle multiple types of data, with some of that data needing true ACID transactions (think passwords, or payment information), and some that don't need ACID transactions but are used frequently and require high performance, (think recommending videos based on viewing history).
Using an RDBMS as the database for the entire application would be unfeasible, just as handling password and payment information with the lack of ACID transactions in NoSQL would be the wrong choice. Developers should use databases based on the needs of the data, and with consideration for application performance and scalability.
Many developers thought that NoSQL was going to replace SQL. And, in some projects, it has. But more and more projects need to handle diverse data types that need both RDBMS and NoSQL. The decision of when to use NoSQL vs. NoSQL should be made service by service and based on the needs of the data being processed, and the performance of the combined application.
Looking for additional reading on databases? Our recent article on Hibernate ORM gives a good overview of how differing database types can be reconciled via object-relational mapping, and the problems that come with it.
Improving SQL and NoSQL Performance
Looking for ways to improve your database performance? Try XRebel. Used during development, it can help developers to trace inefficient database requests, even in distributed applications.