Basic Riak MapReduce: Analyzing Apache Logs
Simon Buckle:
This article will show you how to do some Apache log analysis using Riak and MapReduce. Specifically it will give an example of how to extract URLs from Apache logs stored in Riak (the map phase) and provide a count of how many times each URL was requested (the reduce phase).
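To make the two phases concrete, here is a minimal sketch in plain Python of the map and reduce steps described above. It does not use Riak or its client API; it only simulates the data flow locally. The function names (`map_phase`, `reduce_phase`, `count_urls`), the sample log lines, and the regular expression are illustrative assumptions rather than code from the linked article, and the pattern assumes logs in Apache's common log format.

```python
import re
from collections import Counter

# Assumed format: Apache common log format, where the request appears
# in double quotes, e.g. "GET /index.html HTTP/1.1".
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) [^"]*"')


def map_phase(log_line):
    """Map step: emit [(url, 1)] for one log line, or [] if no request is found."""
    match = REQUEST_RE.search(log_line)
    return [(match.group(1), 1)] if match else []


def reduce_phase(pairs):
    """Reduce step: sum the counts for each URL."""
    counts = Counter()
    for url, n in pairs:
        counts[url] += n
    return dict(counts)


def count_urls(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = []
    for line in lines:
        pairs.extend(map_phase(line))
    return reduce_phase(pairs)


# Hypothetical sample data, made up for illustration only.
sample_logs = [
    '127.0.0.1 - - [27/Aug/2011:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [27/Aug/2011:10:00:05 +0000] "GET /about.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [27/Aug/2011:10:01:00 +0000] "GET /index.html HTTP/1.1" 304 0',
]

if __name__ == "__main__":
    for url, hits in sorted(count_urls(sample_logs).items()):
        print(url, hits)
```

Because the reduce step takes `(url, count)` pairs rather than bare URLs, its output can be fed back into another reduce pass, which is convenient when partial results are combined in stages.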
NoSQL-based solutions for centralized log analysis abound these days:
- Facebook Builds HBase-based Real-Time Analytics
- Graylog2: MongoDB-backed Syslog System
- Digg’s Story View Counts with Redis
- Using Pig Latin and Amazon Elastic Map Reduce for log analysis
- Firefox Downloads Visualization Powered by HBase
- HBase and Hadoop for data-mining on the raw logs at Infolinks
- Another Redis Use Case: Centralized Logging
- Flume and Apache logs
- Watch System Logs in Real Time with Redis Pub/Sub
And these are just a few of the examples I've been able to find.
Original title and link: Basic Riak MapReduce: Analyzing Apache Logs (©myNoSQL)
via: http://www.simonbuckle.com/2011/08/27/analyzing-apache-logs-with-riak/