NoSQL Conference: All content tagged as NoSQL Conference in NoSQL databases and polyglot persistence
Saturday, 25 February 2012
Cassandra as the Central Nervous System of Your Distributed Systems with Joe Stein - Powered by NoSQL
In the 4th week of DataStax's Cassandra NYC 2011 video series, we have Joe Stein from Medialets talking about the Medialets architecture.
Before diving into the video, here are some interesting data points:
- Medialets serves rich media ads
- they handle 3-4TB of daily data
- microsecond-level response times
- Cassandra is used for time series and aggregate metrics
- all MapReduce jobs written in Python. This reminded me of the recent post about the performance impact of operations in Hadoop Map phase
[Figure: Medialets architecture]
Major components of the Medialets architecture:
- Kafka
- MySQL
- Cassandra: 6 node cluster, 100k requests, single DC
- Hadoop
- ZooKeeper: coordinates all the services on the platform
- some of the data in MySQL is replicated in Cassandra (and coordinated with ZooKeeper)
- data is fed back to MySQL
- Kafka for collecting analytics data:
  - aggregates go into Cassandra
  - events go into Hadoop
- GROUP BY with Cassandra
- for real-time systems aggregations must be done upfront
- the way data is segmented is critical
- aggregation leads to data explosion
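The last three points (aggregate upfront, segment carefully, expect data explosion) can be made concrete with a small sketch. This is not Medialets' code: the dimension names (`ad`, `device`, `hour`), the sample events, and the `rollup_keys` and `record` helpers are all hypothetical, and an in-memory `Counter` stands in for whatever counter storage a real system would use. The idea is simply that a "GROUP BY" answered at read time has to be paid for at write time, by bumping one counter per combination of dimensions you may later want to group by.

```python
from collections import Counter
from itertools import combinations


def rollup_keys(event, dimensions):
    """Yield one counter key per subset of dimensions.

    The empty subset is the grand total; the full subset is the finest grain.
    """
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            yield tuple((d, event[d]) for d in dims)


def record(counters, event, dimensions):
    """Pre-aggregate at write time: bump every rollup this event belongs to."""
    for key in rollup_keys(event, dimensions):
        counters[key] += 1


# Hypothetical dimensions and events, for illustration only.
dimensions = ("ad", "device", "hour")
events = [
    {"ad": "a1", "device": "ios", "hour": "2012-02-25T10"},
    {"ad": "a1", "device": "android", "hour": "2012-02-25T10"},
    {"ad": "a2", "device": "ios", "hour": "2012-02-25T11"},
]

counters = Counter()
for event in events:
    record(counters, event, dimensions)

# "GROUP BY ad" is now a plain key lookup instead of a scan.
views_for_a1 = counters[(("ad", "a1"),)]
total_views = counters[()]
distinct_counters = len(counters)
counter_updates = sum(counters.values())
```

With 3 dimensions every event touches 2^3 = 8 counters, so these 3 events cause 24 counter updates spread over 18 distinct keys. That growth is the "data explosion", and it is why choosing which dimensions to segment by matters so much.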
Sunday, 19 February 2012
Cassandra at Clearspring with Chris Burroughs - Powered by NoSQL
For today’s Powered by Cassandra video from the Cassandra NYC 2011 event organized by DataStax, I chose Chris Burroughs’s presentation about Clearspring’s usage of Cassandra. In case you wonder what Clearspring does, the sharing buttons you see here on myNoSQL are powered by the AddThis product from Clearspring.
Saturday, 18 February 2012
Cassandra 101 for System Administrators with Nathan Milford - Powered by NoSQL
While today was supposed to bring a new educational video from the Cassandra NYC 2011 video series, I thought that the lessons learned from operating Cassandra at Outbrain to serve over 30 billion impressions monthly would be quite educational too.
Sunday, 12 February 2012
Scaling Video Analytics with Cassandra by Ilya Maykov - Powered by NoSQL
To keep with last week’s model (an educational video about Cassandra, followed by a Cassandra case study), today’s video in the Cassandra NYC 2011 video series from DataStax has Ilya Maykov describing how Cassandra is used at Ooyala for computing multi-dimensional video analytics reports for 100M+ monthly unique users in near-real-time.
Saturday, 11 February 2012
Cassandra Data Modeling Examples with Matthew F. Dennis - NoSQL videos
Continuing the Cassandra NYC 2011 video series, made available by the folks from DataStax, this week we have Matthew F. Dennis, who covers a couple of different Cassandra data modeling use cases.
Sunday, 5 February 2012
Cassandra at SocialFlow with Drew Robb - Powered by NoSQL
To alternate a bit after yesterday’s educational CQL: SQL for Cassandra in the Cassandra NYC 2011 video series from DataStax, today’s video is Drew Robb covering Cassandra usage at SocialFlow for capturing real-time data from Twitter and Bit.ly.
Saturday, 4 February 2012
CQL: SQL for Cassandra with Eric Evans - NoSQL videos
The fine folks from DataStax have made available the presentations from their Cassandra NYC 2011 event.
The first video to post here is Eric Evans’s presentation on Cassandra Query Language.
Friday, 1 July 2011
Hadoop Summit 2011 in Review
For those of us who weren’t at the Hadoop Summit 2011:
Ryan Rosario
The main takeaway from Hadoop Summit 2010 was Cascalog. I predict the main takeaway from Hadoop Summit 2011 is Spark.
Anant Jhingran
My essential points are that the “birthers” (where hadoop has been born) and “adopters” (where hadoop will be used in enterprises) have a strong intersection today, modulo some extras on both sides…
However, at t = 3 years from now, we can either go separate ways because of different demands… or come together […]
Dave Cahill
[Hadoop] No longer a West Coast early adopter phenomenon. Hadoop isn’t quite mainstream, but almost, not quite at enterprise level purchasing but getting close.
Barton George interviewing Eric Baldeschwieler
A 4-minute interview with Eric Baldeschwieler, CEO of Hortonworks, the Yahoo! Hadoop spin-off:
Announcements
- Cloudera Enterprise 3.5: full lifecycle management of Apache Hadoop deployments, featuring the Service and Configuration Manager, Activity Monitor, and enhancements to Resource Manager and Authorization Manager
- Karmasphere Studio Community Hadoop Virtual Appliance for developers: a free virtual machine image including Apache Hadoop, Ubuntu Linux, the Eclipse IDE and Karmasphere Studio Community.
Last, but not least, you can read Derrick Harris’ overview post.
Original title and link: Hadoop Summit 2011 in Review (©myNoSQL)
Monday, 8 November 2010
Notes from the MongoBerlin Conference
At least 6 MongoDB talks summarized on topics like: BRAINREPUBLIC MongoDB case study, MongoDB internals, MongoDB indexing and query optimizer, MongoDB sharding internals, MongoDB replication internals, and scaling with MongoDB. I’ve found the ones on MongoDB internals quite interesting:
query optimizer:
- it’s empirical, i.e. at first it tries all possible ways to get the results, and then remembers which one works best (it runs all algorithms in parallel and finishes as soon as one of them finishes), then reuses that knowledge in future requests
- if the selected algorithm becomes very slow, it tries all possible ways again
- so the first time a query is run, it might be quite slow
- on the other hand, if something changes later, e.g. an index becomes slow, Mongo will work around that
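To make the notes above more tangible, here is a toy model of such an empirical optimizer. It is a sketch of the behaviour described in the notes, not MongoDB's actual implementation: the `race`, `make_plan` and `EmpiricalOptimizer` names, the plan names, the step costs and the `slowdown_factor` threshold are all assumptions made for illustration (the notes only say "very slow"). Round-robin stepping of generators stands in for "runs all algorithms in parallel", which keeps the example deterministic.

```python
def race(plans):
    """Step every candidate plan in turn; the first one to finish wins.

    Returns (plan name, result, number of rounds it took).
    """
    running = {name: plan() for name, plan in plans.items()}
    rounds = 0
    while True:
        rounds += 1
        for name, gen in running.items():
            try:
                next(gen)
            except StopIteration as done:
                return name, done.value, rounds


class EmpiricalOptimizer:
    """Remembers the winning plan per query shape; re-races if it degrades."""

    def __init__(self, slowdown_factor=10):
        self.slowdown_factor = slowdown_factor  # hypothetical "very slow" threshold
        self.cache = {}  # query shape -> (winning plan name, rounds it needed)

    def run(self, shape, plans):
        cached = self.cache.get(shape)
        if cached is not None:
            name, baseline = cached
            budget = baseline * self.slowdown_factor
            gen = plans[name]()
            steps = 0
            try:
                while steps < budget:
                    next(gen)
                    steps += 1
            except StopIteration as done:
                return name, done.value  # remembered plan is still fine
            del self.cache[shape]  # over budget: forget it and try everything again
        name, result, rounds = race(plans)
        self.cache[shape] = (name, rounds)
        return name, result


def make_plan(name, costs, result):
    """Build a fake plan that needs costs[name] steps before returning result."""
    def plan():
        for _ in range(costs[name]):
            yield
        return result
    return plan


# Hypothetical plans and costs (in abstract "steps").
costs = {"index_a": 2, "index_b": 5, "table_scan": 50}
plans = {name: make_plan(name, costs, "rows") for name in costs}

optimizer = EmpiricalOptimizer(slowdown_factor=10)
first = optimizer.run("find_by_x", plans)   # races all plans, index_a wins
second = optimizer.run("find_by_x", plans)  # reuses index_a directly
costs["index_a"] = 1000                     # the remembered index becomes slow
third = optimizer.run("find_by_x", plans)   # gives up on index_a, re-races
```

In this run `first` and `second` both come back via `index_a`, while `third` switches to `index_b` after the remembered plan blows its budget. The cost of the approach is visible too: the first call and the re-race both do work on every candidate plan, which matches the note that the first execution of a query can be slow.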
Original title and link: Notes from the MongoBerlin Conference (NoSQL databases © myNoSQL)
via: http://psionides.jogger.pl/2010/10/10/notes-from-the-mongoberlin-conference/