


monitoring: All content tagged as monitoring in NoSQL databases and polyglot persistence

Tuning MongoDB Performance with MMS


At MongoLab we manage thousands of MongoDB clusters and regularly help customers optimize system performance. Some of the best tools available for gaining insight into our MongoDB deployments are the monitoring features of MongoDB Management Service (MMS). […] Here we focus primarily on the metrics provided by MMS but augment our analysis with specific log file metrics as well.

There’s definitely something you can learn from guys whose business is running MongoDB.

✚ I continue to be impressed with MongoDB Inc.’s (formerly 10gen) MMS service.

Original title and link: Tuning MongoDB Performance with MMS (NoSQL database©myNoSQL)


HBase migration to the new Hadoop Metrics2 system

Elliott Clarke explains some of the work he is doing to migrate the HBase metrics to Hadoop’s Metrics2 system:

As HBase’s metrics system grew organically, Hadoop developers were making a new version of the Metrics system called Metrics2. In HADOOP-6728 and subsequent JIRAs, a new version of the metrics system was created. This new subsystem has a new name space, different sinks, different sources, more features, and is more complete than the old metrics. When the Metrics2 system was completed, the old system (aka Metrics1) was deprecated. With all of these things in mind, it was time to update HBase’s metrics system so HBASE-4050 was started. I also wanted to clean up the implementation cruft that had accumulated.

The post is more about the specific implementation details than about the wide range of metrics HBase already supports and how the new system would unify them and allow extending them.

Original title and link: HBase migration to the new Hadoop Metrics2 system (NoSQL database©myNoSQL)


How to Monitor MongoDB

A post by the Pandora FMS team about monitoring options for MongoDB:

If MongoDB goes wrong, all your apps will fail. So monitoring the main variables and configuration parameters of your database is the best option to make sure that your values are right and your users are happy.

The post talks about the tools that come with MongoDB (mongotop, mongostat, the stats available through the MongoDB shell, log files, etc.), and also introduces their Pandora FMS library. It does not mention 10gen’s hosted MongoDB Monitoring Service, other MongoDB monitoring utilities, or the recent MongoMem library for per-collection memory usage.

As for which stats are the most interesting, Simon Maynard’s 5 things to Monitor in MongoDB and these other 3 metrics should be a good start.

Original title and link: How to Monitor MongoDB (NoSQL database©myNoSQL)


Using MetricFactory to Get Hadoop Metrics Into Graphite

Staying on the topic of monitoring, Jelle Smet writes about getting Hadoop metrics into Graphite using MetricFactory:

Using this setup we can accept Ganglia metrics over UDP from Hadoop, convert using multiple parallel processes the metrics to Graphite format in and submit the converted metrics in batches to Graphite. I’m planning to add more functionality to MetricFactory. Currently it can tackle mod_gearman and Ganglia data. Using the examples in this article you should be able to setup your own MetricFactory based setups relatively easy.
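
As a rough illustration of the conversion step described in the quote, the sketch below turns already-decoded metric samples into Graphite's plaintext format (`path value timestamp`, one metric per line) and groups them into batches. This is not MetricFactory's code: the `Sample` type, the `hadoop` prefix, and the batch size are assumptions made for the example, and decoding Ganglia's UDP packets is left out entirely.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Sample:
    """A decoded metric sample (hypothetical shape, not Ganglia's wire format)."""
    host: str
    name: str
    value: float
    timestamp: int  # Unix time in seconds


def to_graphite_line(sample: Sample, prefix: str = "hadoop") -> str:
    """Format one sample as a Graphite plaintext line: 'path value timestamp'."""
    # Dots separate path components in Graphite, so flatten them in the host name.
    host = sample.host.replace(".", "_")
    return f"{prefix}.{host}.{sample.name} {sample.value} {sample.timestamp}"


def batches(samples: Iterable[Sample], size: int = 2) -> Iterator[str]:
    """Yield newline-terminated payloads containing up to `size` lines each."""
    lines: List[str] = []
    for sample in samples:
        lines.append(to_graphite_line(sample))
        if len(lines) == size:
            yield "\n".join(lines) + "\n"
            lines = []
    if lines:  # flush the final, possibly smaller, batch
        yield "\n".join(lines) + "\n"
```

Each payload could then be written to a socket connected to Graphite's plaintext listener; the host and port depend on your own Carbon setup.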

Original title and link: Using MetricFactory to Get Hadoop Metrics Into Graphite (NoSQL database©myNoSQL)


Riak Metrics With Folsom

In previous versions of Riak the same process gathered metrics and calculated statistics. In certain situations, under load, reading statistics would slow down, or even timeout altogether. The call to read stats would block that same process that updates stats, leading to large message queue backlogs.

This is the sort of observation and improvement that only a product running in heavy production use could make.

Original title and link: Riak Metrics With Folsom (NoSQL database©myNoSQL)


Visualizing Systems Behavior: Subsecond Offset Heat Maps

This one is for ops, devops, noops, and everyone else who loves state-of-the-art visualizations for investigating real systems behavior:

The subsecond offset heat map puts time on two axes. The x-axis shows the passage of time, with each column representing one second. The y-axis shows the time within a second, spanning from 0.0s to 1.0s (time offsets). The z-axis (color) show the count of samples or events, quantized into x- and y-axis ranges (“buckets”), with the color darkness reflecting the event count (darker == more).
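
The bucketing described in the quote can be sketched in a few lines. This is only a minimal illustration, assuming event timestamps given as floating-point seconds; the function name `subsecond_heatmap` and the `rows` parameter are made up for the example and are not taken from the original article.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def subsecond_heatmap(
    timestamps: Iterable[float], start: float, rows: int = 50
) -> Dict[Tuple[int, int], int]:
    """Count events per (column, row) bucket.

    column: whole seconds elapsed since `start` (the x-axis)
    row:    offset within that second, quantized into `rows` buckets (the y-axis)
    count:  number of events in the bucket (the color, darker == more)
    """
    counts: Counter = Counter()
    for t in timestamps:
        elapsed = t - start
        if elapsed < 0:
            continue  # ignore events before the start of the window
        column = int(elapsed)
        offset = elapsed - column  # 0.0 <= offset < 1.0
        row = min(int(offset * rows), rows - 1)  # guard against rounding up
        counts[(column, row)] += 1
    return dict(counts)
```

Rendering is then a matter of mapping each bucket's count to a color intensity, with any plotting library you prefer.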



Original title and link: Visualizing Systems Behavior: Subsecond Offset Heat Maps (NoSQL database©myNoSQL)


Netezza Query History Table

Using Netezza’s in-database analytics package FPGROWTH, database administrators can identify the most commonly used combination of tables and the performance of the queries that reference those sets of tables.

Nice feature. Sort of a rich man’s all-inclusive version of MySQL’s slow query log. Do you know if other databases support a similar feature?

Original title and link: Netezza Query History Table (NoSQL database©myNoSQL)


What Can Be Learned From Heroku Outage Postmortem

While some may learn a few new things or find confirmation in the details of the outage, what caught my attention in Heroku’s postmortem analysis are the conclusions:

  • higher sensitivity and more aggressive monitoring on a variety of metrics
  • improved early warning systems
  • better containment
  • improved flow controls, both manual and automatic
  • expanding simulations of unusual load conditions in our staging environment

None of these are particular to a specific storage or NoSQL database. But they all reflect the reality of operating at large scale where even the most operationally friendly solutions—think of Dynamo-inspired NoSQL databases—cannot and should not be left unmonitored or unsupervised or with no clear recovery strategies and processes in place.

In the NoSQL world, one of the most covered outages was the MongoDB outage at Foursquare. And in case you don’t remember the details, most of the circumstances that led to that event could have been prevented by having:

  1. better monitoring
  2. early warnings
  3. better operational procedures

Don’t these two lists look very much alike?

Original title and link: What Can Be Learned From Heroku Outage Postmortem (NoSQL database©myNoSQL)


Amazon Elastic MapReduce New Features: Metrics, Updates, VPC, and Cluster Compute Support

Starting today customers can view graphs of 23 job flow metrics within the EMR Console by selecting the Monitoring tab in the Job Flow Details page. These metrics are pushed to CloudWatch every five minutes at no cost to you and include information on:

  • Job flow progress including metrics on the number of map and reduce tasks running and remaining in your job flow and the number of bytes read and written to S3 and HDFS.
  • Job flow contention including metrics on HDFS utilization, map and reduce slots open, jobs running, and the ratio between map tasks remaining and map slots.
  • Job flow health including metrics on whether your job flow is idle, if there are missing data blocks, and if there are any dead nodes.

That’s like free pr0n for operations teams.

On a different note, I’ve noticed that the Hadoop stack (Hadoop, Hive, Pig) on Amazon Elastic MapReduce is based on second-to-last versions, which suggests that extensive testing is performed on Amazon’s side before new versions are rolled out.

Original title and link: Amazon Elastic MapReduce New Features: Metrics, Updates, VPC, and Cluster Compute Support (NoSQL database©myNoSQL)


10gen’s MongoDB Monitoring Service: Smart Move

You’ve probably heard of the free MongoDB monitoring service launched by 10gen: MMS, docs, and ToS.

Leaving aside that this is a useful tool for both developers and ops people, it is also a very useful tool for 10gen to monitor and understand MongoDB adoption. A hosted monitoring system will give 10gen good insight into the kinds of workloads and data sizes MongoDB is handling, not to mention details about the issues MongoDB users most frequently face. Last, but not least, with an SLA, MMS could become a paid service, or 10gen could license it to large MongoDB users that require this data to remain in-house. Smart move.

Congrats 10gen!

Original title and link: 10gen’s MongoDB Monitoring Service: Smart Move (NoSQL database©myNoSQL)

Monitoring Riak Using Circonus

Denish Patel:

It turns out you can plug all the critical Riak Stats metrics into Circonus[1] with no effort and very little time. […] I could add all the required checks for Riak Database server under 5 minutes into Circonus!!

It is Riak that deserves the praise here, for publishing useful stats that let an admin feel happy and in control.

  1. Circonus: SaaS performance monitoring of both business and infrastructure metrics, in Cloud and standard environments.  

Original title and link: Monitoring Riak Using Circonus (NoSQL database©myNoSQL)


Monitoring MongoDB

The guys from Boxed Ice have already published a lot about their experience running MongoDB in production, plus a series of posts advising on MongoDB monitoring. They also offer a hosted service for MongoDB monitoring: Server Density.

If you still don’t have MongoDB monitoring in place (unfortunately, Foursquare had to learn what this means the hard way), check this talk from David Mytton[1].

  1. The video quality is very low and I couldn’t embed it here.  

Original title and link: Monitoring MongoDB (NoSQL database©myNoSQL)