usecase: All content tagged as usecase in NoSQL databases and polyglot persistence
Monday, 4 October 2010
MongoDB Arrays for Social Likes and Follows
On social sites, you generally want to like something (a comment, post, page, etc.) or be friends with someone. MongoDB has arrays, which can be indexed and, as we found out, are perfect for this.
While most of these scenarios will work just fine, the one that can get a bit more complicated is handling highly concurrent likes/counters.
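A minimal Python sketch of one way to keep the likes array and a counter consistent under concurrent likes. The `likes` and `like_count` field names are assumptions for illustration and do not come from the linked post. `$ne`, `$push`, and `$inc` are standard MongoDB update operators; the in-memory `apply_like` helper only imitates what the server would do with that single conditional update, so the example runs without a database.

```python
def like_update(post_id, user_id):
    """Build the (filter, update) pair for one conditional update.

    The filter matches only if user_id is not yet in the likes array,
    so the push and the counter increment happen together or not at all.
    """
    query = {"_id": post_id, "likes": {"$ne": user_id}}
    update = {"$push": {"likes": user_id}, "$inc": {"like_count": 1}}
    return query, update


def apply_like(post, user_id):
    """In-memory stand-in for the server applying like_update to one document."""
    if user_id in post["likes"]:
        return False  # the filter would not match, so nothing changes
    post["likes"].append(user_id)
    post["like_count"] += 1
    return True


# Hypothetical post document; "ann" likes it twice, only the first counts.
post = {"_id": 1, "likes": [], "like_count": 0}
results = [apply_like(post, user) for user in ("ann", "bob", "ann")]
```

With a driver such as pymongo, the pair would typically be passed to something like `collection.update_one(query, update)`, which keeps the check and the write in a single server-side operation.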
Original title and link: MongoDB Arrays for Social Likes and Follows (NoSQL databases © myNoSQL)
via: http://www.w3matter.com/journal/2010/9/30/mongodb-arrays-for-social-likes-and-follows.html
CouchDB: Flexible Forms and Data
Flexible forms and data with CouchDB (☞ here and ☞ here)
[…] at some point i got in need of some function to serialize forms into deep json objects so that i can push whatever the form has, directly to couchdb.
The first time I read about something similar was in a NYTimes ☞ article about using MongoDB for storing both forms and their data:
Displaying a photo submission form now requires a single lookup. The form is stored as a document in a top-level collection, and the set of custom fields become embedded documents within that.
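As a rough illustration of serializing a form into a deep JSON object, here is a small Python sketch. It assumes that nesting is encoded with dotted field names such as `author.name`; that convention, and all field names below, are assumptions, since the quoted posts do not say how their forms encode structure.

```python
import json


def nest_form_fields(flat_fields):
    """Turn {"a.b.c": value} style form fields into a nested dict.

    Assumes no field is used both as a plain value and as a prefix
    (for example "author" together with "author.name").
    """
    document = {}
    for name, value in flat_fields.items():
        parts = name.split(".")
        node = document
        for key in parts[:-1]:
            node = node.setdefault(key, {})  # create intermediate objects
        node[parts[-1]] = value
    return document


# Hypothetical submitted form, including one custom field.
form = {
    "title": "Sunset",
    "author.name": "Ann",
    "author.email": "ann@example.com",
    "custom.camera": "D90",
}
doc = nest_form_fields(form)
payload = json.dumps(doc, sort_keys=True)  # ready to send to a document store
```

The custom fields end up as an embedded object inside the same document, which is what makes the single-lookup read described above possible.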
MongoDB at RServe
Why is schema-free important for RServe? Because we plan to support many business types. Different business types usually come with different reservation data. We want RServe users to be able to embed custom data in their reservation data.
Basically, RServe is using the schema-less MongoDB as a form of multi-tenancy.
Another reason for using document databases is to (try to) avoid the complexity of such schemas:
via: http://blog.rserve.me/post/1177981948/the-reason-behind-nosql-and-schema-free-database
Monday, 27 September 2010
MongoDB Use Case: Archiving
Document-oriented databases, with their flexible schemas, provide a nice solution. We can have older documents which vary a bit from the newer ones in the archive. The lack of homogeneity over time may mean that querying the archive is a little harder. However, keeping the data is potentially much easier.
I think this is pushing the schema migration issue from data to code, which might actually be a good idea.
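To make "pushing the schema migration from data to code" concrete, here is a hedged Python sketch. The two document shapes and the `schema_version` field are invented for illustration; the idea is only that archived documents stay untouched and the reader normalizes them on the way out.

```python
def normalize_customer(doc):
    """Return an archived document in the current (version 2) shape.

    Version 1 (hypothetical): {"name": "First Last"}
    Version 2 (hypothetical): {"schema_version": 2, "first_name": ..., "last_name": ...}
    """
    version = doc.get("schema_version", 1)  # old documents carry no version
    if version == 1:
        first, _, last = doc["name"].partition(" ")
        return {"schema_version": 2, "first_name": first, "last_name": last}
    return doc


old_doc = {"name": "Ada Lovelace"}
new_doc = {"schema_version": 2, "first_name": "Alan", "last_name": "Turing"}
archive = [normalize_customer(d) for d in (old_doc, new_doc)]
```

The cost is that every reader of the archive has to go through such a function, which is the "querying is a little harder" side of the trade-off.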
via: http://blog.mongodb.org/post/1200539426/archiving-a-good-mongodb-use-case
Hadoop Usecase: Fighting Spam in Big Data
A much more serious Hadoop use case:
Worldwide spam volumes this year are forecast to rise by 30% to 40% compared with 2009. Spam recently reached a record 92% of total email. Spammers have turned their attention to social media sites as well. In 2008, there were few Facebook phishing messages; Facebook is now the second most phished organization online. Even though Twitter has managed to recently bring its spam rate down to as low as 1%, the absolute volume of spam is still massive given its tens of millions of users. Dealing with spam introduces a number of Big Data challenges. The sheer size and scale of the data is enormous. In addition, spam in social media involves the need to understand very complex patterns of behavior as well as to identify new types of spam.
Make sure you check these 10 problems that can use Hadoop.
Thursday, 23 September 2010
Document databases: 11 Document-oriented Applications
From Zef Hemel:
Some examples of document-oriented applications:
- CRM
- Contact Address/Phone Book
- Forum/Discussion
- Bug Tracking
- Document Collaboration/Wiki
- Customer Call Tracking
- Expense Reporting
- To-Dos
- Time Sheets
- Help/Reference Desk
Looking at this list I’m like, what application is not document-oriented?
A partial answer to the last question is simple: all those that require highly connected data.
Flume Cookbook: Flume and Apache Logs
Part of the ☞ Flume cookbook:
In this post, we present a recipe that describes the common use case of using a Flume node to collect Apache 2 web server logs in order to deliver them to HDFS.
In case you want to (initially) skip Flume’s user guide, you could start with this intro to Flume and then read how Flume and Scribe compare.
via: http://www.cloudera.com/blog/2010/09/using-flume-to-collect-apache-2-web-server-logs/
Tuesday, 21 September 2010
Redis at GitHub
From Werner Schuster’s InfoQ interview with Scott Chacon:
Q: You mentioned using Redis. How do you use that?
A: We use Redis for exception handling and for our queue. We tried a lot of Ruby-based queuing mechanisms. Chris wrote an abstraction to the queuing mechanism. We used to use BJ and DJ, and in the super early days we tried out Amazon SQS and a lot of queuing mechanisms, and they all fell over at one point or another with the amount of traffic that we were doing on them and the types of queries that we were trying to get from them. Eventually we moved to a Redis-based one that Chris also wrote, called Resque. That’s open source, you can get it on GitHub; a couple of other companies use it, but it’s Redis-backed. We use the Redis list and stuff to queue up jobs and to pull the jobs out of that, and it’s been really solid. If you are using DJ or something and it’s not working quite well for you, then you might want to check out Resque.
GitHub is also using Redis for configuration management. And Redis queues are already a well-known use case.
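The "Redis list as a job queue" idea can be sketched in a few lines of Python. This is not Resque's code: the `queue:` key prefix, the JSON payload layout, and the `JobQueue` class are assumptions made for illustration. The in-memory `InMemoryLists` class is a stand-in that offers only `rpush` and `lpop`, the two list commands the sketch relies on, so it runs without a Redis server.

```python
import json
from collections import defaultdict, deque


class InMemoryLists:
    """Stand-in client exposing the two list calls used below."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def rpush(self, name, *values):
        self._lists[name].extend(values)  # append at the tail
        return len(self._lists[name])

    def lpop(self, name):
        items = self._lists[name]
        return items.popleft() if items else None  # take from the head


class JobQueue:
    """FIFO job queue on top of any client offering rpush and lpop."""

    def __init__(self, client, name):
        self.client = client
        self.key = "queue:" + name  # hypothetical key naming

    def enqueue(self, job_class, *args):
        payload = json.dumps({"class": job_class, "args": list(args)})
        self.client.rpush(self.key, payload)

    def dequeue(self):
        raw = self.client.lpop(self.key)
        return json.loads(raw) if raw is not None else None


queue = JobQueue(InMemoryLists(), "default")
queue.enqueue("Archive", 42)
queue.enqueue("Notify", "ann")
first, second, third = queue.dequeue(), queue.dequeue(), queue.dequeue()
```

Pushing at one end and popping at the other gives first-in, first-out ordering, and because each pop is a single command, two workers never receive the same job.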
Hadoop: 10 Problems That Can Use Hadoop
Mike Pearce summarizing a presentation about problems where Hadoop can be a good fit:
- Modeling True Risk
- Customer Churn Analysis
- Recommendation engines
- Ad Targeting
- Point of Sale Transaction Analysis
- Analyzing Network Data to Predict Failure
- Threat Analysis/Fraud Detection
- Trade Surveillance
- Search Quality
- Data “Sandbox”
As you can see, most of these boil down to “Aggregate Data, Score Data, Present Score As Rank”, which, at its simplest, is what Hadoop can do.
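The "aggregate, score, rank" shape can be shown with a single-process Python sketch. It does not use any Hadoop API; on a cluster the aggregation step would be spread across map and reduce tasks. The event data and the scoring rule (share of the grand total) are made up for illustration.

```python
from collections import defaultdict


def aggregate(events):
    """Sum the amounts per key (the part a MapReduce job would distribute)."""
    totals = defaultdict(int)
    for key, amount in events:
        totals[key] += amount
    return totals


def score(totals):
    """Score each key as its share of the grand total."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {key: value / grand_total for key, value in totals.items()}


def rank(scores):
    """Order keys by descending score, breaking ties by key."""
    return sorted(scores, key=lambda key: (-scores[key], key))


events = [("a", 3), ("b", 5), ("a", 4), ("c", 1)]
totals = aggregate(events)
ranking = rank(score(totals))
```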
If you need more ideas, just check the research published on the dating site OkCupid ☞ blog.
via: http://blog.mikepearce.net/2010/08/18/10-hadoop-able-problems-a-summary/
Monday, 20 September 2010
MongoDB Use Case: Site Analytics, A Reoccurring Scenario
Remember Hummingbird, the MongoDB-based real-time web traffic visualization tool? And Eventbrite’s usage of MongoDB for page view tracking? And Yottaa’s scalable event analytics backed by MongoDB? This is how you’d describe why MongoDB is a good fit for this scenario:
I want to track a bunch of data for certain kinds of views and then display custom analytics. The data collected includes a combination of request environment and internal statistics correlated with request parameters. I did not want to write this to a traditional database for every request because:
- the data is adjunct to the functionality,
- it involves a select+insert or select+update for each request and
- writes are expensive.
Furthermore, the write is not critical enough to hold up the request, and definitely not worth adding a queue infrastructure.
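One common way to avoid the select+insert or select+update round trip in MongoDB is an upsert combined with `$inc`. The quoted author does not say this is what they did, so treat the following Python sketch as an assumption. The `path`, `day`, and `views` field names are invented, and `apply_upsert` is an in-memory imitation of the server-side behavior so the example runs without a database.

```python
def page_view_update(path, day):
    """Build the (filter, update) pair for counting one page view."""
    query = {"path": path, "day": day}
    update = {"$inc": {"views": 1}}
    return query, update


def apply_upsert(collection, query, update):
    """Imitate an upsert with $inc on a plain list of dicts."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in query.items()):
            break  # found the matching document
    else:
        doc = dict(query)  # no match: create it from the filter fields
        collection.append(doc)
    for field, amount in update["$inc"].items():
        doc[field] = doc.get(field, 0) + amount
    return doc


stats = []
for path in ("/home", "/about", "/home"):
    apply_upsert(stats, *page_view_update(path, "2010-09-20"))
```

With pymongo this would look roughly like `collection.update_one(query, update, upsert=True)`: one write per request and no prior read.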
Monday, 6 September 2010
Extensive Riak Benchmarking at Mozilla Test Pilot
Mozilla has previously published their detailed plan and extensive investigation into Cassandra, HBase, and Riak that led to choosing Riak. This time they are publishing extensive Riak benchmark results (against both Riak 0.10 and Riak 0.11 running Bitcask). They are using the Riak benchmarking code, which is included in the list of correct NoSQL benchmarks and performance evaluation solutions. The results, their analysis, and their interpretation are all fascinating.
Our goal in running these studies was, simply put, no surprises. That meant we needed to run studies that profiled:
- Latency
- Stability, especially for long running tests
- Performance when we introduced variable object sizes
- Performance when we introduced pre-commit hooks to evaluate incoming data
I guess Mozilla Test Pilot is one of Riak’s most interesting case studies.
Wednesday, 1 September 2010
Too Much Redis?
Ben Curtis ☞ thinks that using Redis for managing friends lists as described in the ☞ EngineYard post is overly complicated:
Yesterday I read a post over at the EngineYard blog about a use case for Redis (in the name of being a polyglot, trying new things, etc.), and I just had to scratch my head. I love Redis — it rocks my world — but that example was too much for me. If you just want to store a set of ids somewhere to avoid normalization headaches, introducing Redis is overkill… just do it in MySQL!
He goes on to propose a MySQL solution in which friend IDs are serialized as a comma-separated list. Frankly speaking, I do see quite a few advantages Redis has compared to this one:
- Redis knows how to handle sets
- you don’t have to deal with de-duplication
- (most probably) the storage is optimized
- with manual serialization you’ll have to deal with all concurrency issues occurring when updating these lists
So what is the advantage of Ben’s suggested solution?
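The de-duplication and concurrency points above can be illustrated with a short Python sketch. The function name and IDs are invented; a Python `set` stands in for a Redis set, and the two "concurrent" writers are simulated sequentially to show the lost update.

```python
def add_friend_csv(serialized, friend_id):
    """Read-modify-write on a comma-separated column value."""
    ids = [part for part in serialized.split(",") if part]
    if str(friend_id) not in ids:  # de-duplication is the caller's job
        ids.append(str(friend_id))
    return ",".join(ids)


# Two writers read the same stored value, then both write back.
stored = "1,2"
writer_a = add_friend_csv(stored, 3)
writer_b = add_friend_csv(stored, 4)
stored = writer_a
stored = writer_b  # last write wins: friend 3 is silently lost
csv_result = stored

# Set semantics: each add is one operation on the shared value,
# and adding an existing member is a no-op.
friends = {1, 2}
friends.add(3)
friends.add(4)
friends.add(3)
```

Avoiding the lost update with the serialized column requires a transaction or row lock around every read and write, whereas a server-side set add (`SADD` in Redis) needs neither.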
