NoSQL: All content tagged as NoSQL in NoSQL databases and polyglot persistence
Three presentations covering the various NoSQL usages at Twitter:
Kevin Weil talking about data analysis using Scribe for logging, base analysis with Pig/Hadoop, and specialized data analysis with HBase, Cassandra, and FlockDB on InfoQ
Ryan King’s presentation from last year’s QCon SF NoSQL track on Gizzard, Cassandra, Hadoop, and Redis on InfoQ
Dmitriy Ryaboy on Hadoop from Devoxx 2010:
- Twitter: Cassandra, HBase, Hadoop, Scribe, FlockDB, Redis
- Facebook: Cassandra, HBase, Hadoop, Scribe, Hive
- Netflix: Amazon SimpleDB, Cassandra
- Digg: Cassandra
- SimpleGeo: Cassandra
- StumbleUpon: HBase, OpenTSDB
- Yahoo!: Hadoop, HBase, PNUTS
- Rackspace: Cassandra
And probably many more missing from the list. But that could change if you leave a comment.
A panel discussion on NoSQL, NoSQL databases, and relational databases, featuring Salvatore Sanfilippo, Lenz Grimmer, Filipe David Borba Manana, and a fourth person from SAPO whose name I couldn’t spell:
Featuring Jacob Burch, Alex Gaynor, Eric Florenzano, Jacob Kaplan-Moss, Michael Richardson, Noah Silas
You probably know already that Django and NoSQL is hot!
- Unfortunately it looks like the real people on the panel aren’t the same as the ones listed.
Story and script by Latha Annur Subramaniam:
In case you have a hard time reading it:
RDBMS (SQL) was worried by the news about a new technology called NoSQL.
SQL: Oh, he is this new NoSQL guy. People say he is here to beat me out. Hmm. I just hate him!
NoSQL: Howdy, senior SQL. How are you?
SQL: um… uh… oh Hi young man. Looks like you are new to this place?
NoSQL: Oh yeah! Just out of the ‘Latest computing trends’ school.
SQL: He is just a fresher. But I am his great grand senior. He can never take me down.
SQL: Hey, your name NoSQL sounds strange. Sounds like you are an anti-SQL guy.
NoSQL: Hm… true. I feel so unfortunate about my name. But I am never an alternate to you. In short, I am a new solution for the fresh new problems of this computing era… the “WEBSCALE” era.
SQL: (Hey he sounds modest. Am kinda like this guy) Oh. Am hearing this term for the first time. What is this W-E-B SCALE thing all about?
NoSQL: Interestingly, these days humans lead a much active social life on the WEB only.
NoSQL: Just like in their real life, people always need more and more of everything. Tweet, Search, Maps, Blog… their needs never end ;-)
SQL: Hmmm. Now I get it. I’ve been the darling for the enterprises for their data storage needs. But maybe they will abandon me and choose you, when they need more scale?!?!
NoSQL: Partially true. I can help them in scaling massively. But you are still the best in a lot of things.
NoSQL: For example, you are the Superstar when it comes to ‘transaction based apps’. I can never beat you in your ACID qualities
NoSQL: Also, I am still not the best for ‘Reporting’ requirements. While my ‘schemaless’ quality helps dynamically add different types of data, it causes the drawback of not being helpful for reporting.
SQL: I feel you are the right fit for the modern social apps.
NoSQL: You are the right bet for the critical business apps… soon until I catch up with you
SQL: Yup. I wish you good luck, young man.
NoSQL: Thank you, senior. Btw, my name doesn’t mean a NO to SQL!!! It is only that I am NOT only SQL :-)
SQL: and so I dedicate this song to you buddy:
Twinkle, twinkle NoSQL
Was wondering who you are
Out into this computing world,
I wish you success all around!!!
Definitely not as good as “MongoDB is web scale”.
Something to think about:
- if you are using some caching in your application, would you call that the persistence layer?
- if you are using a distributed cache, would you call that your persistence layer?
- if you are using a replicated and distributed cache, would you call that your persistence?
- if your replicated and distributed cache does some sort of snapshotting to disk, would you call that your persistence?
Some are saying ☞ RAM is the new disk, so I’m wondering what their answers to the above questions are.
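To make the last question concrete, here is a minimal sketch of a cache that snapshots to disk. It is a toy, not how any particular product works: the `SnapshottingCache` class, the JSON file format, and the `demo` function are all assumptions made for illustration. What it shows is the window between snapshots, where anything written after the last snapshot is gone after a crash.

```python
import json
import os
import tempfile


class SnapshottingCache:
    """Toy in-memory key-value cache with explicit snapshots (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.data = {}

    def set(self, key, value):
        # The write lives only in RAM until the next snapshot.
        self.data[key] = value

    def snapshot(self):
        # Write to a temp file, then rename, so a crash mid-write
        # never leaves a half-written snapshot behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.path)

    @classmethod
    def recover(cls, path):
        # Rebuild the cache from whatever the last snapshot contained.
        cache = cls(path)
        if os.path.exists(path):
            with open(path) as f:
                cache.data = json.load(f)
        return cache


def demo():
    path = os.path.join(tempfile.mkdtemp(), "cache.json")
    cache = SnapshottingCache(path)
    cache.set("a", 1)
    cache.snapshot()
    cache.set("b", 2)  # written after the last snapshot
    # Simulate a crash: forget the in-memory object, reload from disk.
    return SnapshottingCache.recover(path).data
```

Calling `demo()` gives back only the key written before the snapshot. Whether that counts as "persistence" depends on how much of that window your application can afford to lose.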
For people who don’t really grok what’s been said in this post (maybe because it was just too long to read), my recommended setup is: “Use Redis for small datasets that don’t grow fast (stay far less than 1GB). Have at least 2x memory than the dataset. Use default snapshotting and disable AOF.”
Considering this time I was one of those that didn’t really follow the first part of the article, filing it under “not sure what all these have to do with Redis persistence implementation”, I’ve found Jeff Darcy’s ☞ follow up adding a bit more context(!) to the discussion:
I’d rephrase above as “Use Redis for small datasets (less than 50GB this year) that don’t need to be highly available, have memory at least 2x your actual dataset (until the snapshot implementation improves), use frequent snapshotting or AOF (depending on your need for performance vs. durability – not both) and always avoid overcommit.” I also have nothing against Redis, it’s a fine tool for what it does, but I think its durability story is a bit confused and its reinvented VM can only serve a need that it’s not good for anyway. As always, the real answer is to use multiple data stores to serve multiple needs, with careful consideration of the tradeoffs each represents.
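The two sizing rules of thumb quoted above can be written down as a quick check. This is only a sketch of the arithmetic: the function name `redis_sizing_ok` and its parameters are made up for illustration, and the defaults simply encode the numbers from Jeff Darcy's rephrasing (dataset under 50GB, memory at least 2x the dataset).

```python
def redis_sizing_ok(dataset_gb, ram_gb, max_dataset_gb=50.0, headroom=2.0):
    """Check the quoted rules of thumb (hypothetical helper).

    - the dataset stays below max_dataset_gb
    - available memory is at least headroom times the dataset size
    """
    small_enough = dataset_gb < max_dataset_gb
    enough_memory = ram_gb >= headroom * dataset_gb
    return small_enough and enough_memory


# 10GB of data on a 32GB box satisfies both rules.
print(redis_sizing_ok(10, 32))
# 10GB of data on a 16GB box lacks the 2x headroom.
print(redis_sizing_ok(10, 16))
# The stricter original recommendation: stay far below 1GB.
print(redis_sizing_ok(0.5, 1, max_dataset_gb=1.0))
```

Passing `max_dataset_gb=1.0` switches the check to the original, more conservative recommendation. Neither version says anything about availability or the snapshot vs. AOF choice, which still need a decision of their own.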