While NoSQL databases are not all alike, there are certain tradeoffs common to them all:
- Data integrity
- Flexible indexing
- Interactive updating of data
- Concurrency guarantees
If I were at DatabaseJournal, I’d click unpublish as fast as possible.
Original title and link: The Hidden Cost of Scaling With NoSQL… Spreading FUD ( ©myNoSQL)
Dr. C. Mohan’s first post about NoSQL databases:
Having worked in the database field for more than 3 decades with a fair amount of impact on the research and commercial sides of this field (see bit.ly/cmohan), it pains me to see the casual way in which some designs have been done and some supposedly new ideas get proposed/implemented. Not enough efforts are being made to relate these proposals to what has been done in the past and benefit from the lessons learnt in the context of RDBMSs. Not everything needs to be done differently just because it is supposedly a very different world now!
There are evolutionary and revolutionary products. And sometimes changing the perspective and starting from scratch is needed to validate or invalidate hypotheses, new or old. In the world of polyglot persistence there’s space for every solution that solves real problems. As perfect as one product might be, it will not be able to address every need. The data storage space is not a zero-sum game. Winners don’t take it all.
Original title and link: The NoSQL Hoopla … What Is NonsenSQL About It? ( ©myNoSQL)
The data source is the download counts of Jaspersoft’s NoSQL connectors. RedMonk published a graphic and an analysis, and Klint Finley followed up with job trends:
A couple of things I don’t see mentioned in the RedMonk post:
- If and how the data has been normalized based on each connector’s availability. According to the post, data was collected between Jan. 2011 and Mar. 2012, and I think that not all connectors were available from the beginning of that period.
- If and how the marketing push for each connector has been weighed in. Announcing the Hadoop connector at an event with 2000 attendees or the MongoDB connector at an event with 800 attendees could definitely influence the results (nb: keep in mind that the largest number is less than 7000, thus 200-500 downloads triggered by such an event have a significant impact).
- Redis and VoltDB are mostly OLTP-only databases.
Original title and link: NoSQL Databases Adoption in Numbers ( ©myNoSQL)
The Gazzang Encryption Platform for Big Data works as a last line of defense for protecting data within Hadoop, Cassandra and MongoDB, non-relational, distributed and horizontally scalable data stores that have become common management tools for big data initiatives.
Sounds good so far. But then:
Gazzang today launched a cloud-based encryption […] The Encryption Platform transparently encrypts and secures data “on the fly,” whether in the cloud or on premises, ensuring there is minimal performance lag in the encryption or decryption process.
Does anyone have any idea how a cloud-based solution could encrypt/decrypt on-premises data on the fly? I don’t.
Original title and link: Data Encryption for Hadoop and NoSQL Databases From Gazzang ( ©myNoSQL)
Can a skyscraper completed in 1931 be used to explain a parallel processing algorithm introduced in 2004? In this post, I use the analogy of counting smartphones in the Empire State Building to explain MapReduce…without using code.
Andrew Brust’s metaphor is nice, but I wonder if these days there’s a single person working with anything close to big data who still needs a 771-word description of how MapReduce works.
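For readers who would rather see code than a metaphor, here is a minimal single-process sketch of the same smartphone count. The `floors` data and all function names are made up for illustration, and nothing here is distributed: it only shows the map, shuffle, and reduce steps in order.

```python
from collections import defaultdict

# Hypothetical survey data: floor number -> phones seen on that floor.
floors = {
    1: ["iphone", "android", "iphone"],
    2: ["android", "windows"],
    3: ["iphone"],
}

def map_floor(phones):
    """Map step: emit a (phone_type, 1) pair for every phone on one floor."""
    return [(phone, 1) for phone in phones]

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(key, values):
    """Reduce step: collapse the grouped values for one key into a total."""
    return key, sum(values)

def count_phones(floors):
    # In a real cluster each floor would be mapped by a different worker;
    # here the floors are simply processed one after another.
    mapped = [pair for phones in floors.values() for pair in map_floor(phones)]
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())

totals = count_phones(floors)
```

The point of the split is that `map_floor` needs no knowledge of other floors and `reduce_counts` needs no knowledge of other phone types, which is what lets a framework run both steps in parallel.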
Original title and link: A 771 Words Description of Map Reduce ( ©myNoSQL)
- We’re dealing with much more data.
- We require sub-second responses to queries.
- We want applications to be up 24/7
- We’re seeing many applications in which the database has to soak up data as fast (or even much faster) than it processes queries
- We’re frequently dealing with changing data or with unstructured data
- We’re willing to sacrifice our sacred cows.
Not bad. But it reads more like the definition of Big Data.
Original title and link: 6 Reasons Why We Need NoSQL ( ©myNoSQL)
Based on this information (nb: the post is a short version of not all NoSQL databases are the same) I think the term “NoSQL” is doing all of the non-relational database options a disservice. The term “NoSQL” does help to argue with management that maybe a relational database is not the best option but that’s about where its usefulness ends.
I haven’t kept count of how many times I’ve heard this argument and its alternative, “NoSQL is a (very) bad term”. What these seem to forget is that, united under the NoSQL moniker, the non-relational databases coped more easily with all the attacks from detractors and got the attention they deserved. Maybe it is too wide a term or even a meaningless one, but it served well in bringing awareness to polyglot persistence.
Original title and link: The Generalization of “NoSQL” ( ©myNoSQL)
Here are a couple of examples of using MySQL in interesting (and it’s up to you whether unwise) ways:
- MySQL as a graph database, like Twitter’s FlockDB.
- MySQL as document store, like FriendFeed’s extremely custom schema design.
- MySQL as a key/value store. This lets you play with NoSQL concepts using MySQL.
Such examples abound. In fact, most of the companies known for contributing to or using NoSQL databases run some sort of interesting relational database deployment. Most of the time these examples are interpreted as clear proof that relational databases can solve any problem. Reality is different though: the engineers’ long familiarity with relational databases allowed them to ingeniously work around their limitations when better alternatives were lacking. But with NoSQL databases getting more mature every day, fewer and fewer problems require acrobatic uses of relational databases.
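To make the key/value usage listed above concrete, here is a minimal sketch of a key/value interface layered on a single relational table. It uses Python’s built-in `sqlite3` module as a stand-in so that it runs without a server; it is not MySQL, and the `KVStore` class, the `kv` table, and its column names are all hypothetical.

```python
import sqlite3

class KVStore:
    """Hypothetical key/value wrapper over one two-column relational table."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def put(self, key, value):
        # INSERT OR REPLACE is SQLite syntax. On MySQL the equivalent would be
        # REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )
        self.conn.commit()

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def delete(self, key):
        self.conn.execute("DELETE FROM kv WHERE k = ?", (key,))
        self.conn.commit()

# In-memory database, so the example leaves nothing behind.
store = KVStore(sqlite3.connect(":memory:"))
store.put("user:1", "alice")
store.put("user:1", "bob")  # overwrites the previous value for the same key
```

Everything the relational engine offers beyond primary-key lookup (joins, secondary indexes, ad hoc queries) goes unused here, which is exactly the kind of acrobatics the examples above describe.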
Original title and link: Examples of Using MySQL in Interesting Ways ( ©myNoSQL)
The first thing you need to realize is that you’re a victim of the SQL hammer. If having a hammer means everything is a nail, then it’s the same for SQL. Imagine, for instance, that Google’s search engine (which is trying to deliver information that is personal and relevant to you) was built by a group of SQL engineers. First, they would have designed a global data schema for all the information on the planet. Then they would have used the extract, transform and load (ETL) process and data-cleansing tools to bring all the information on the planet into their global SQL database. Finally, they would write reports such as: “Places to camp in France,” or “Chinese restaurants in Hickory, N.C.” After 10 years and tens of millions of dollars, the team would probably have given up. Fortunately, Google didn’t take that approach.
As much as I support the NoSQL space, I’ve already seen victims of the NoSQL hammer in only 2 and a half years. Both the hammer and the shiny new toy syndromes are just as present. One could even say that you cannot have the one without the other.
Original title and link: Victims of the SQL Hammer ( ©myNoSQL)
Before moving back to NoSQL databases, I wanted to stay in the land of file systems for a conversation between Jeff Darcy and David Strauss about the usage of file systems for large scale and high availability:
As I see it, aggregating local filesystems to provide a single storage pool with a filesystem interface and aggregating local filesystems to provide a single storage pool with another interface (such as a column-oriented database) aren’t even different enough to say that one is definitely preferable to the other. The same fundamental issues, and many of the same techniques, apply to both. Saying that filesystems are the wrong way to address scale is like saying that a magnetic #3 Phillips screwdriver is the wrong way to turn a screw. Sometimes it is exactly the right tool, and other times the “right” tool isn’t as different from the “wrong” tool as its makers would have you believe.
Original title and link: Scaling Filesystems vs. Other Things ( ©myNoSQL)
So, my proposal is this: take a step back from ORMs, and consider working more closely with SQL and a good database driver. Try to work with the database, and find out what it has to offer; don’t use layers of indirection to avoid knowing about the database. See what you like and don’t like about the process after an honest assessment, and whether ORMs are a real improvement or a distracting complication.
I know a lot of applications using ORMs that work perfectly fine. And I know applications that had to work around their ORMs or even got rid of them completely.
Here is a parallel to think about: ORM vs SQL is similar to always using a relational database versus using the storage solution that better fits the problem, as in using a NoSQL database or adopting polyglot persistence. An ORM comes with the advantage of keeping you inside a single paradigm (object oriented) at the cost of not being able to (easily) use the full power of the underlying storage.
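As a small illustration of what working directly with SQL and a driver looks like, here is a sketch using Python’s built-in `sqlite3` module. The `orders` table, its rows, and the `totals_by_customer` function are invented for the example. The aggregation and filtering happen inside the database rather than on objects loaded into the application.

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, amount REAL);
INSERT INTO orders VALUES ('ann', 10.0), ('ann', 15.0), ('bob', 7.5);
""")

def totals_by_customer(conn, min_total):
    """Return (customer, total) rows whose summed amount is >= min_total."""
    sql = """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        HAVING SUM(amount) >= ?
        ORDER BY total DESC
    """
    # The driver binds min_total as a parameter; no string formatting needed.
    return conn.execute(sql, (min_total,)).fetchall()

big_spenders = totals_by_customer(conn, 10)
```

Whether this is an improvement over an ORM depends on the application; the sketch only shows that the query itself stays visible and can use whatever the database offers.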
Original title and link: Taking a Step Back From ORMs and a Parallel to the Database World ( ©myNoSQL)
Martin Fowler and Pramod Sadalage in an infographic promoting their upcoming book (PDF):
Polyglot persistence will occur over the enterprise as different applications use different data storage technologies. It will also occur within a single application as different parts of an application’s data store have different access characteristics.
It has been over 2 years since I began evangelizing polyglot persistence. By now, most thought leaders agree it is the future. Next on my agenda is having the top relational vendors sign off too. Actually, I’m almost there: Oracle is promoting an Oracle NoSQL Database and Microsoft is offering both relational and non-relational solutions with Azure. They just need to say it.
Original title and link: The Future is Polyglot Persistence ( ©myNoSQL)