NoSQL vs SQL
Couchbase came up at work, so I decided it was finally time to dig in and find out what it was all about. At a high level I've known the gist for quite some time. But I had never really read anything beyond tech articles on the topic.
So, over to the Couchbase website I went. What I saw was ludicrous.
To be clear, I don't believe NoSQL is bad by any means. But they are pitching it as almost universally superior to traditional relational databases, and they do this with an overly simplified example: a single user resume with 3 skills and 2 past experiences.
They complain that in the relational model this would require 6 rows in 3 tables (true) and that in NoSQL it requires just 1 record in one table (also true), and they use this to proclaim the superiority of NoSQL.
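To make the row counting concrete, here is a rough sketch of that resume in both shapes, using Python's built-in sqlite3 as a stand-in for the relational side and a plain dict as a stand-in for the document. The table names, column names and sample values are my own assumptions for illustration, not anything taken from the Couchbase page.

```python
import sqlite3

# Hypothetical schema and data, made up purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE skills (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        skill TEXT
    );
    CREATE TABLE experiences (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        employer TEXT
    );
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Pat')")
conn.executemany(
    "INSERT INTO skills (user_id, skill) VALUES (1, ?)",
    [("SQL",), ("Python",), ("Go",)],
)
conn.executemany(
    "INSERT INTO experiences (user_id, employer) VALUES (1, ?)",
    [("Acme",), ("Globex",)],
)

# Relational shape: 1 user row + 3 skill rows + 2 experience rows.
relational_rows = sum(
    conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("users", "skills", "experiences")
)

# Document shape: the same resume as a single nested record.
resume_document = {
    "name": "Pat",
    "skills": ["SQL", "Python", "Go"],
    "experiences": ["Acme", "Globex"],
}

print(relational_rows, "rows vs 1 document")
```

Both shapes hold exactly the same information; the only thing the count tells you is how it is split up.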
And that is where they lost me.
When talking about a single record, for an entity with a hierarchy just 2 levels deep, it is pretty darn easy to think that NoSQL is globally superior. BUT!!!! I work with tables with hierarchies that go down 5-10 levels or more. And oftentimes I need to query something in one of the lower levels.
I don't believe NoSQL is inherently faster at this. And the queries would not be pretty.
Then there is the MASS data duplication. An RDBMS allows you to normalize your data. It is actually because of that normalization that you end up with 6 records in 3 tables, rather than 1. And while it LOOKS absolutely terrible when you take a single instance, when data is properly normalized, at volume, the actual storage consumption will *generally* be FAR better than a comparable NoSQL approach.
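A quick way to see the duplication point is to measure it. The sketch below is a toy comparison under stated assumptions: 1000 made-up resumes that all share one employer, with serialized JSON length used as a crude proxy for storage. Real engines compress and pack data differently, so treat the numbers as directional only.

```python
import json

# Hypothetical shared employer record.
employer = {
    "id": 1,
    "name": "Acme Corporation",
    "address": "123 Long Street, Springfield",
}

# Document style: every resume embeds its own copy of the employer.
embedded = [{"user": f"user{i}", "employer": employer} for i in range(1000)]

# Normalized style: the employer is stored once and referenced by id.
employers = [employer]
resumes = [{"user": f"user{i}", "employer_id": employer["id"]} for i in range(1000)]

embedded_bytes = len(json.dumps(embedded))
normalized_bytes = len(json.dumps(employers)) + len(json.dumps(resumes))

print("embedded:", embedded_bytes, "normalized:", normalized_bytes)
```

The more data the child records share, the wider that gap gets; if nothing is shared, embedding costs you very little.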
Individual tables in a normalized RDBMS are also generally smaller, and any level in a hierarchical data model can be accessed directly, making such queries MUCH faster.
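Here is a small sketch of what "accessed directly" means in practice. The company/department/team/member hierarchy is hypothetical, and the nested dict only stands in for a document; a real document database may offer indexes or query languages that soften this, so read it as an illustration of the shape of the problem rather than a benchmark.

```python
import sqlite3

# Document style: the lowest level lives inside nested containers,
# so finding a leaf means walking down through every level above it.
company_doc = {
    "name": "Acme",
    "departments": [
        {
            "name": "Engineering",
            "teams": [
                {"name": "Platform", "members": [{"name": "Pat", "role": "DBA"}]},
                {"name": "Web", "members": [{"name": "Sam", "role": "Developer"}]},
            ],
        },
    ],
}
dbas_from_doc = [
    member["name"]
    for department in company_doc["departments"]
    for team in department["teams"]
    for member in team["members"]
    if member["role"] == "DBA"
]

# Relational style: the lowest level is its own table and can be
# queried (and indexed) without touching the levels above it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (
        id INTEGER PRIMARY KEY,
        team_id INTEGER,
        name TEXT,
        role TEXT
    );
    CREATE INDEX idx_members_role ON members(role);
    INSERT INTO members (team_id, name, role)
    VALUES (1, 'Pat', 'DBA'), (2, 'Sam', 'Developer');
""")
dbas_from_sql = [
    row[0] for row in conn.execute("SELECT name FROM members WHERE role = ?", ("DBA",))
]

print(dbas_from_doc, dbas_from_sql)
```

With only three levels the traversal is tolerable. At 5-10 levels, every extra level adds another loop on the document side, while the SQL query stays the same.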
Do you need to do an ass load of joins to get all of the data back? Sure. But the trade-off is that finding the data is simpler and faster. And in some cases, even rebuilding the full data model with a SQL approach will be quicker than the equivalent NoSQL approach.
Oh, and did I mention that with the RDBMS approach you can choose to retrieve only the information you actually care about, as you care about it? Once again, that is another opportunity for SQL to shave off overall time spent. With a document-based data model, when you retrieve a document by its key, you generally retrieve the whole document.
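As a rough illustration of retrieving only what you care about, the sketch below pulls a single column from a wide row with sqlite3, then does the equivalent with a whole-document fetch simulated as a JSON string. The `resumes` table, its columns and the sample text are assumptions made up for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resumes (
        id INTEGER PRIMARY KEY,
        name TEXT,
        summary TEXT,
        cover_letter TEXT
    );
    INSERT INTO resumes VALUES
        (1, 'Pat', 'A long summary...', 'An even longer cover letter...');
""")

# Relational style: ask for one column and only that column comes back.
name_only = conn.execute("SELECT name FROM resumes WHERE id = ?", (1,)).fetchone()

# Document style (simulated): a fetch by key returns the whole stored
# document, which is then parsed before the one needed field is read.
stored_document = json.dumps({
    "name": "Pat",
    "summary": "A long summary...",
    "cover_letter": "An even longer cover letter...",
})
whole_document = json.loads(stored_document)
name_from_doc = whole_document["name"]

print(name_only[0], name_from_doc, len(stored_document))
```

Both paths end up with the same name; the difference is how many bytes had to travel and be parsed to get there.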
For a solution which talks an awful lot about scale, it neglects the data bloat, which at scale can translate into unnecessary bandwidth consumption, CPU cycles and storage costs.
This isn't to slam NoSQL. It absolutely will crush SQL... in certain areas. RDBMS solutions aren't readily adaptable. If your data models need a bit of flexibility, then you will benefit from NoSQL. If your data models aren't super deep, and you generally need ALL of the data, you will most likely benefit from NoSQL. And I'm sure there are a host of other benefits to the approach. But talking about RDBMS like some antiquated notion is folly. It also has its place.
The site then goes on to espouse the benefits in terms of scalability and redundancy, claiming that in SQL solutions these are often complex to configure and often come at an extra cost. And while that is generally true, it is A) not universally true and B) nothing inherent in RDBMS architecture. If, in an alternate dimension, the Couchbase guys had decided to make yet another RDBMS, they could have built the same capabilities in there.
At the end of the day, both SQL and NoSQL solutions are table-based data storage solutions. There is absolutely no reason why scalability or redundancy solutions applied to NoSQL could not also be applied to SQL solutions. And the page admits that such solutions exist. Cost and complexity are both things which CAN be addressed.