The best SQL vs NoSQL mindset I've ever heard
"Why would I want to use the technology that's optimizing the least expensive resource in the data center?"
TL;DR: SQL RDBMSs optimize for storage. NoSQL optimizes for computing power. Today, computing power is expensive while storage is cheap.
I re-watched one of the best conference talks I’ve ever seen. It is a one-hour talk by Rick Houlihan called “Amazon DynamoDB Deep Dive: Advanced Design Patterns for DynamoDB”, and it has the best intro to NoSQL databases I have come across.
If you are a beginner Data Engineer/Data Scientist/Software Engineer, I would say it is a must-see video for you.
NoSQL vs SQL
There is no actual battle between NoSQL and SQL, even if some people think there is one. There are cases when it is better to use SQL and cases when it is better to use a NoSQL solution.
The ability to pick the right solution for a given case is what makes you a valuable engineer.
The key phrase from Rick’s talk is:
"Why would I want to use the technology that's optimizing the least expensive resource in the datacenter?" — Rick Houlihan
In the 20th century, the most expensive part of the database was storage.
Seven years ago, I read Jim Collins’s famous book Built to Last: Successful Habits of Visionary Companies, which was published in 1994. Its intro says something like “We have collected 2 megabytes of data to be analyzed”. My reaction was “hah, any song in my playlist takes more than that”. You do not need a CS degree to realize that two megabytes is nothing today.
Today, storage is cheap, but computing power is still expensive.
SQL joins require CPU. Big SQL joins require a lot of CPU
I have worked with a few SQL databases that had over one billion records. Even column-oriented databases do not perform well when you have to use joins, and they demand a lot of CPU to work well.
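To make the trade-off concrete, here is a minimal sketch in Python. The customers/orders data and the `hash_join` helper are made up for illustration, and a real database engine is far more sophisticated than this. The point is only that a join spends CPU on every read, while a denormalized copy spends extra storage once, at write time.

```python
# Hypothetical normalized data: each fact is stored exactly once.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Linus"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 2, "total": 40.0},
    {"order_id": 12, "customer_id": 1, "total": 15.0},
]


def hash_join(customers, orders):
    """Join at read time: build a hash table, then probe it once per order."""
    by_id = {c["customer_id"]: c for c in customers}  # build phase
    return [
        {
            "order_id": o["order_id"],
            "name": by_id[o["customer_id"]]["name"],  # probe phase
            "total": o["total"],
        }
        for o in orders
    ]


# Hypothetical denormalized data: the customer name is copied into every
# order when it is written. This costs more storage, but a read needs no join.
denormalized_orders = [
    {"order_id": 10, "name": "Ada", "total": 25.0},
    {"order_id": 11, "name": "Linus", "total": 40.0},
    {"order_id": 12, "name": "Ada", "total": 15.0},
]

joined = hash_join(customers, orders)
```

Both paths produce the same rows. The join redoes its build and probe work on every query, and that work grows with the size of both tables.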
When to use NoSQL (DynamoDB)?
“NoSQL is not a flexible database. It is an efficient database.” — Rick Houlihan
You have to know all of your access patterns, all of your queries, all of your use-cases upfront.
Then, you model your NoSQL data for these specific use-cases. You should always assume that changing the model will not be possible once you go to production.
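Here is a rough sketch of what modeling for known access patterns can look like. `ToyTable` is a hypothetical in-memory stand-in for a table with a partition key and a sort key. It is not the DynamoDB API, and the `CUSTOMER#` / `ORDER#` key formats are assumptions chosen for this example.

```python
class ToyTable:
    """Toy partition-key / sort-key table (illustrative, not a real client)."""

    def __init__(self):
        # partition key -> {sort key -> item}
        self._partitions = {}

    def put(self, pk, sk, item):
        self._partitions.setdefault(pk, {})[sk] = item

    def query(self, pk, sk_prefix=""):
        """Read one partition, ordered by sort key, filtered by key prefix."""
        partition = self._partitions.get(pk, {})
        return [partition[sk] for sk in sorted(partition) if sk.startswith(sk_prefix)]

    def scan(self):
        """Visit every item in every partition (the expensive fallback)."""
        for partition in self._partitions.values():
            yield from partition.values()


table = ToyTable()
table.put("CUSTOMER#1", "PROFILE", {"name": "Ada"})
table.put("CUSTOMER#1", "ORDER#2024-01-05", {"total": 25.0})
table.put("CUSTOMER#1", "ORDER#2024-02-11", {"total": 15.0})
table.put("CUSTOMER#2", "ORDER#2024-01-20", {"total": 40.0})

# Planned access pattern: "all orders for one customer, oldest first".
# The keys were designed for it, so it touches a single partition.
orders_for_customer_1 = table.query("CUSTOMER#1", sk_prefix="ORDER#")

# Unplanned access pattern: "all orders over 20, across all customers".
# The keys give no help, so the only option here is a full scan.
big_orders = [item for item in table.scan() if item.get("total", 0) > 20]
```

The planned query is cheap because the key design answers it directly. The unplanned one still works in this toy, but only by reading everything, which is what you want to avoid at scale.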
With SQL, if you normalize your data correctly, you can run almost any transformation on it later, including ones you did not plan for. You do not have this opportunity with NoSQL.
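For contrast, here is a small sketch of that flexibility using Python's built-in `sqlite3` module. The schema and rows are made up for illustration. The second query stands for a question nobody planned when the tables were designed, and it needs no change to the data model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 1, 15.0);
    """
)

# Question 1: total spend per customer.
spend_per_customer = conn.execute(
    """
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

# Question 2, thought of later: which orders are over 20, and whose are they?
big_orders = conn.execute(
    """
    SELECT o.order_id, c.name
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE o.total > 20
    ORDER BY o.order_id
    """
).fetchall()

conn.close()
```

Both questions are answered by joins at read time, which is where the CPU cost discussed earlier comes from.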
The statement that with NoSQL you must know your queries and use-cases upfront is very true. I am a certified MongoDB DBA, and that is something you see in practice. Any attempt to do joins via lookup methods is difficult to accomplish.