Those who read my blog and follow my research know that I chose MongoDB as my backend database to store my PDFs […]
Why not standard SQL? Well, I wanted the data to be returned without having to parse a blob every time (JSON/BSON); PDF files contain a lot of data that is often unique to each file (document-based storage); and Mongo also made it easy to handle dynamic content (no columns). […] I wanted to highlight an interesting way of collecting data and answering questions about my malware using Map/Reduce.
Initially I thought that using a search engine would be a better approach. Then I realized that not everything can be expressed as a query. When complex filtering and grouping algorithms are needed, MapReduce is the better fit.
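To make the "filtering and grouping" point concrete, here is a minimal sketch of the map/reduce pattern in plain Python. It is not the original author's code and it does not talk to MongoDB (whose own map/reduce takes JavaScript functions). The sample documents and the field names `filename`, `malicious`, and `object_types` are assumptions made up for illustration. The question it answers is: among the malicious PDFs, how often does each object type appear?

```python
from collections import defaultdict

# Hypothetical sample documents; the field names are assumptions for illustration only.
documents = [
    {"filename": "a.pdf", "malicious": True, "object_types": ["JavaScript", "OpenAction"]},
    {"filename": "b.pdf", "malicious": True, "object_types": ["JavaScript"]},
    {"filename": "c.pdf", "malicious": False, "object_types": ["JavaScript", "Font"]},
]


def map_doc(doc):
    """Map step: skip benign files, then emit (object_type, 1) for each object type."""
    if not doc["malicious"]:
        return
    for object_type in doc["object_types"]:
        yield object_type, 1


def reduce_counts(key, values):
    """Reduce step: collapse all values emitted for one key into a single count."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    """Run the mapper over every document, group emitted values by key, then reduce."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


counts = map_reduce(documents, map_doc, reduce_counts)
print(counts)
```

The filtering lives in the map step and the grouping falls out of the emitted keys, which is the kind of logic that is awkward to express as a single search query.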
Original title and link: Malware, MongoDB and Map/Reduce: A New Analyst Approach (NoSQL databases © myNoSQL)