facebook: All content tagged as facebook in NoSQL databases and polyglot persistence
As everyone probably knows by now, Cassandra originated at Facebook as a solution for inbox search and was then open sourced under the ASF umbrella and an Apache license. Since then, Twitter, Digg, Reddit, and quite a few others have started using it, but not much has been heard from Facebook.
So, in case you are wondering ☞ what’s up with Cassandra, here’s a very concise update:
- Twitter and Digg are not planning to fork the project. In fact, there are clear plans to contribute back their work on Cassandra (see this for more details)
- Facebook is still using Cassandra internally for the inbox search, but they are using their own version
- even though Facebook has stopped contributing to the Cassandra project beyond the initial code share, the community at the ASF is doing well (read: growing)
- Riptano, the company founded by Cassandra project lead Jonathan Ellis and Matt Pfeil, is offering technical support, professional services, and training for Cassandra
Update: an interesting ☞ note (dated July 7th) from Twitter engineer Nick Kallen:
Twitter no longer intends to use Cassandra for any critical data-stores in the near term future.
Lately I’ve mentioned Hive quite a few times when writing about working with NoSQL data, but I was missing a good slide deck covering the Hive architecture, usage scenarios, and other interesting details.
The presentation embedded below, from the Facebook Data Infrastructure team, provides all of these and much more (e.g. Hive usage at Facebook, Hadoop and Hive clusters, etc.)