monitoring: All content tagged as monitoring in NoSQL databases and polyglot persistence
Thursday, 17 January 2013
How to Monitor MongoDB
A post by Pandora FMS team about monitoring options for MongoDB:
If MongoDB goes wrong, all your apps will fail. So monitoring the main variables and configuration parameters of your database is the best option to make sure that your values are right and your users are happy.
The post talks about the tools that come with MongoDB (mongotop, mongostat, the stats available through the MongoDB shell, logfiles, etc.), but also introduces their PandoraFMS library. There’s no word about 10gen’s hosted MongoDB Monitoring Service, nor other MongoDB utilities for monitoring or the latest MongoMem collection memory usage library.
As for which stats are the most interesting, Simon Maynard’s 5 things to Monitor in MongoDB and these other 3 metrics should be a good start.
Original title and link: How to Monitor MongoDB (©myNoSQL)
Friday, 11 January 2013
Using MetricFactory to Get Hadoop Metrics Into Graphite
Staying on the topic of monitoring, Jelle Smet writes about getting Hadoop metrics into Graphite using MetricFactory:
Using this setup we can accept Ganglia metrics over UDP from Hadoop, convert the metrics to Graphite format using multiple parallel processes, and submit the converted metrics in batches to Graphite. I’m planning to add more functionality to MetricFactory. Currently it can tackle mod_gearman and Ganglia data. Using the examples in this article you should be able to set up your own MetricFactory-based setups relatively easily.
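To make the "convert to Graphite format and submit in batches" part concrete, here is a minimal sketch of that step alone. It is not MetricFactory code: the function names, the metric paths, and the host are assumptions for illustration. The only thing taken as given is Graphite's plaintext protocol, where each line is "path value timestamp" and the listener usually sits on TCP port 2003.

```python
import socket
import time


def to_graphite_lines(metrics, timestamp=None):
    """Format (path, value) pairs as Graphite plaintext lines: 'path value timestamp'."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return ["%s %s %d" % (path, value, ts) for path, value in metrics]


def batches(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_to_graphite(lines, host="localhost", port=2003, batch_size=100):
    """Send lines in batches over one TCP connection (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        for batch in batches(lines, batch_size):
            sock.sendall(("\n".join(batch) + "\n").encode("ascii"))


# Hypothetical metric names, for illustration only.
sample = [("hadoop.namenode.FilesTotal", 1200), ("hadoop.datanode.bytes_written", 52341)]
lines = to_graphite_lines(sample, timestamp=1357776000)
```

Calling send_to_graphite(lines) would push the two sample lines to a local Graphite listener, if one is running.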
Original title and link: Using MetricFactory to Get Hadoop Metrics Into Graphite (©myNoSQL)
via: http://smetj.net/2013/01/10/using-metricfactory-to-get-hadoop-metrics-into-graphite/
Monday, 9 July 2012
Riak Metrics With Folsom
In previous versions of Riak the same process gathered metrics and calculated statistics. In certain situations, under load, reading statistics would slow down, or even timeout altogether. The call to read stats would block that same process that updates stats, leading to large message queue backlogs.
This is the sort of observation and improvement that only a product running in production (heavy production) could make.
Original title and link: Riak Metrics With Folsom (©myNoSQL)
via: http://basho.com/blog/technical/2012/07/02/folsom-backed-stats-riak-1-2/
Wednesday, 28 March 2012
Visualizing Systems Behavior: Subsecond Offset Heat Maps
This one is for ops, devops, noops and everyone else who loves some state-of-the-art visualization and investigation of real-systems behavior:
The subsecond offset heat map puts time on two axes. The x-axis shows the passage of time, with each column representing one second. The y-axis shows the time within a second, spanning from 0.0s to 1.0s (time offsets). The z-axis (color) shows the count of samples or events, quantized into x- and y-axis ranges (“buckets”), with the color darkness reflecting the event count (darker == more).

Speechless
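For the curious, the bucketing described in the quote can be sketched in a few lines. This is only my reading of the description, not the code behind the original heat maps; the function name, the bucket count, and the sample event times are made up.

```python
from collections import Counter


def subsecond_heatmap(timestamps, buckets=10):
    """Count events per (second, offset bucket) cell.

    timestamps: event times in seconds (floats).
    buckets: number of y-axis rows splitting the 0.0s-1.0s range.
    """
    counts = Counter()
    for t in timestamps:
        second = int(t // 1)   # x-axis: which one-second column
        offset = t - second    # y-axis: time within that second
        row = min(int(offset * buckets), buckets - 1)
        counts[(second, row)] += 1
    return counts


# Made-up event times, for illustration only.
events = [0.05, 0.07, 0.52, 1.05, 1.51, 1.55, 1.58]
heat = subsecond_heatmap(events)
```

Each key in heat is one cell of the map and its count is what would drive the color darkness.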
Original title and link: Visualizing Systems Behavior: Subsecond Offset Heat Maps (©myNoSQL)
via: http://dtrace.org/blogs/brendan/2012/03/26/subsecond-offset-heat-maps/
Friday, 16 March 2012
Netezza Query History Table
Using Netezza’s in-database analytics package FPGROWTH, database administrators can identify the most commonly used combination of tables and the performance of the queries that reference those sets of tables.
Nice feature. Sort of a rich man’s, all-inclusive version of MySQL’s slow query log. Do you know if other databases support a similar feature?
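To illustrate the idea of finding the most commonly used combinations of tables, here is a rough sketch. It deliberately swaps FP-growth for brute-force counting of table combinations, which is fine for a handful of queries but not for a real query history, and it has nothing to do with Netezza's actual package or its API. The function name, parameters, and query history are all invented.

```python
from collections import Counter
from itertools import combinations


def frequent_table_sets(queries, min_support=2, max_size=3):
    """Count how often each combination of tables is referenced together.

    queries: one collection of table names per logged query.
    Returns {frozenset of tables: number of queries}, keeping counts >= min_support.
    """
    counts = Counter()
    for tables in queries:
        unique = sorted(set(tables))
        for size in range(2, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


# Invented query history, for illustration only.
history = [
    ["orders", "customers"],
    ["orders", "customers", "products"],
    ["orders", "products"],
    ["customers", "regions"],
]
common = frequent_table_sets(history)
```

Joining such table sets with per-query timings would give the "performance of the queries that reference those sets of tables" part of the quote.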
Original title and link: Netezza Query History Table (©myNoSQL)
Tuesday, 6 March 2012
What Can Be Learned From Heroku Outage Postmortem
While some may learn a few new things or get a confirmation from the details of the outage, what caught my attention in Heroku’s postmortem analysis are the conclusions:
- higher sensitivity and more aggressive monitoring on a variety of metrics
- improved early warning systems
- better containment
- improved flow controls, both manual and automatic
- expanding simulations of unusual load conditions in our staging environment
None of these is specific to a particular storage system or NoSQL database. But they all reflect the reality of operating at large scale, where even the most operationally friendly solutions (think of Dynamo-inspired NoSQL databases) cannot and should not be left unmonitored, unsupervised, or without clear recovery strategies and processes in place.
In the NoSQL world, one of the most covered outages was the MongoDB outage at Foursquare. And in case you don’t remember the details, most of the circumstances that led to that event could have been prevented by having:
- better monitoring
- early warnings
- better operational procedures
Don’t these two lists look very much alike?
Original title and link: What Can Be Learned From Heroku Outage Postmortem (©myNoSQL)
Wednesday, 1 February 2012
Amazon Elastic MapReduce New Features: Metrics, Updates, VPC, and Cluster Compute Support
Starting today customers can view graphs of 23 job flow metrics within the EMR Console by selecting the Monitoring tab in the Job Flow Details page. These metrics are pushed to CloudWatch every five minutes at no cost to you and include information on:
- Job flow progress including metrics on the number of map and reduce tasks running and remaining in your job flow and the number of bytes read and written to S3 and HDFS.
- Job flow contention including metrics on HDFS utilization, map and reduce slots open, jobs running, and the ratio between map tasks remaining and map slots.
- Job flow health including metrics on whether your job flow is idle, if there are missing data blocks, and if there are any dead nodes.
That’s like free pr0n for operations teams.
On a different note, I’ve noticed that the Hadoop stack (Hadoop, Hive, Pig) on Amazon Elastic MapReduce is based on second-to-last versions, which suggests that extensive testing is performed on Amazon’s side before new versions are rolled out:
- Hadoop: 0.20.205 (precursor of Hadoop 1.0.0), which supports append and security but doesn’t have RAID, symlinks or MR2
- Hive: 0.7.1 (precursor of latest 0.8.0)
- Pig: 0.9.1 (precursor of latest 0.9.2)
Original title and link: Amazon Elastic MapReduce New Features: Metrics, Updates, VPC, and Cluster Compute Support (©myNoSQL)
Sunday, 27 November 2011
10gen’s MongoDB Monitoring Service: Smart Move
You’ve probably heard of the free MongoDB monitoring service launched by 10gen: MMS, docs, and ToS.

Leaving aside that this is a useful tool for both developers and ops people, it is also a very useful tool for 10gen to monitor and understand MongoDB adoption. A hosted monitoring system will provide 10gen with good insights into what kind of workloads and data sizes MongoDB is handling, not to mention details about frequent issues MongoDB users are facing. Last, but not least, with an SLA MMS could become a paid service, or 10gen could license it to large MongoDB users that require this data to remain in-house. Smart move.
Congrats 10gen!
Original title and link: 10gen’s MongoDB Monitoring Service: Smart Move (©myNoSQL)
Wednesday, 24 August 2011
Monitoring Riak Using Circonus
Denish Patel:
It turns out you can plug all the critical Riak Stats metrics into Circonus with no effort and very little time. […] I could add all the required checks for Riak Database server under 5 minutes into Circonus!!
Riak is to be praised here for publishing useful stats that can make an admin feel happy and in control.
Original title and link: Monitoring Riak Using Circonus (©myNoSQL)
via: http://slowquery.blogspot.com/2011/07/monitoring-riak-using-circonus.html
Thursday, 28 July 2011
Monitoring MongoDB
The guys from Boxed Ice have already published a lot about their experience running MongoDB in production, plus a series of posts advising on MongoDB monitoring. Actually they are offering a hosted service for MongoDB monitoring: Server Density.
If you still don’t have MongoDB monitoring in place—unfortunately Foursquare had to learn what this means the hard way—check this talk from David Mytton[1].
[1] The video quality is very low and I couldn’t embed it here.
Original title and link: Monitoring MongoDB (©myNoSQL)
Monday, 6 June 2011
Radish: Redis Analysis and Monitoring Service
A hosted monitoring tool for Redis:
Radish works side-by-side with your existing Redis instances in your hosting environment. All you need to do is start a daemon which handles connecting to Redis and sending data out to Radish over HTTPS.
Using Radish, you’ll be able to:
- analyze the amount of reads/writes
- track information like “average commands per second, memory usage, changes since last save, and keyspace size.” (here.)
- identify high-frequency commands and keys
Even if it looks like a very well polished product, the Hacker News community was quick to point out that monitoring Redis using Munin or Cacti is cheaper.

Original title and link: Radish: Redis Analysis and Monitoring Service (NoSQL databases © myNoSQL)
via: http://robots.thoughtbot.com/post/6037147900/radish-dig-deep-into-redis
Monday, 10 January 2011
MongoDB db.serverStatus() Explained: MongoDB Monitoring
The BoxedIce guys are continuing their series on monitoring MongoDB, this time explaining parts of the db.serverStatus() output: connections, index counters, background flushing (N.B. this is the part where MongoDB can lose your data), and opcounters.
MongoDB db.serverStatus() is similar in a way to MySQL SHOW VARIABLES LIKE 'Inno%' and SHOW ENGINE [INNODB] STATUS.
These aside, I assume this whole series is just a preamble for their upcoming MongoDB monitoring service.
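As a small illustration of what one can do with two of those sections, here is a sketch that turns the connections and opcounters documents into numbers worth graphing. The opcounters are cumulative since server start, so a rate needs two samples. The helper names and the sample values are invented, and the samples are hand-written dictionaries rather than output fetched from a live server.

```python
def opcounter_rates(previous, current, interval_seconds):
    """Per-second operation rates between two serverStatus-style samples."""
    return {
        op: (current["opcounters"][op] - previous["opcounters"][op]) / interval_seconds
        for op in current["opcounters"]
    }


def connection_usage(status):
    """Fraction of the connection limit currently in use."""
    conns = status["connections"]
    return conns["current"] / (conns["current"] + conns["available"])


# Hand-written samples shaped like db.serverStatus() output; the numbers are invented.
earlier = {"connections": {"current": 40, "available": 760},
           "opcounters": {"insert": 1000, "query": 5000, "update": 300, "delete": 20}}
later = {"connections": {"current": 200, "available": 600},
         "opcounters": {"insert": 1600, "query": 8000, "update": 360, "delete": 20}}

rates = opcounter_rates(earlier, later, interval_seconds=60)
usage = connection_usage(later)
```

A server restart resets the counters, so a negative rate is a hint that the previous sample should be discarded.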
Original title and link: MongoDB db.serverStatus() Explained: MongoDB Monitoring (NoSQL databases © myNoSQL)