Posted on the MongoDB mailing list:
I have about 500M log file entries each representing an “ad impression” (we are an advertising company). Each “hit” has about 50 attributes to it (example: Country, State, City, Adsize, Browser, OS, etc) .. I want to load all 500M into some form of database and then run queries against this set.
As you could expect MongoDB is considered as a possibility. But I’d call that a biased vendor advise. I’ll be blunt: invest in your future by using Hadoop and Pig. Hive may fit too.
Original title and link for this post: MongoDB or Hadoop? (published on the NoSQL blog: myNoSQL)