This article will show you how to do some Apache log analysis using Riak and MapReduce. Specifically it will give an example of how to extract URLs from Apache logs stored in Riak (the map phase) and provide a count of how many times each URL was requested (the reduce phase).
NoSQL based solutions for centralized logs analysis abound these days:
This blog is called myNoSQL and it is written by me, Alex Popescu, a software architect with a passion for open source and communities.
It records my readings, learnings, and opinions on NoSQL databases, polyglot persistence, and distributed systems -- subjects that I'm passionate about.
The opinions expressed here are my own, and no other party necessarily agrees with them.
If you feel I'm biased, I probably am.