ALL COVERED TOPICS

NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Membase Amazon SimpleDB MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter

NAVIGATE MAIN CATEGORIES

Close

scalability: All content tagged as scalability in NoSQL databases and polyglot persistence

What other popular paradigms/architectures can handle large scale computational problems?

Interesting answers on Quora mostly expanding on Krishna Sankar’s short answer:

There are two ways one can address large scale computational problems:

  • Task Parallelism : This is where MPI and so forth fit in
  • Data Parallelism : This is the sweet spot for map/reduce

Original title and link: What other popular paradigms/architectures can handle large scale computational problems? (NoSQL database©myNoSQL)


Auto Scaling in the Amazon Cloud: Netflix's Approach and Lessons Learned

Another great post for today from the engineering team at Netflix:

Auto scaling is a very powerful tool, but it can also be a double-edged sword. Without the proper configuration and testing it can do more harm than good. A number of edge cases may occur when attempting to optimize or make the configuration more complex. As seen above, when configured carefully and correctly, auto scaling can increase availability while simultaneously decreasing overall costs.

Original title and link: Auto Scaling in the Amazon Cloud: Netflix’s Approach and Lessons Learned (NoSQL database©myNoSQL)

via: http://techblog.netflix.com/2012/01/auto-scaling-in-amazon-cloud.html


Asking for Performance and Scalability Advice on StackOverflow

How many times have you got an answer that applies to your specific scenario when providing a short list of performance and scalability requirements? MySQL/InnoDB can do 750k qps, Cassandra is scaling linearly, MongoDB can do 8 mil ops/s. Is any of these the answer for your application?

Actually:

  • How many times did you get all the requirements right at the spec time?

  • How many times did requirements remain the same during the development cycle?

  • How many times did production reality confirmed your bullet list requirements?

Original title and link: Asking for Performance and Scalability Advice on StackOverflow (NoSQL database©myNoSQL)


Podcast: MySQL Cluster News: Performance Improvements,New NoSQL Access

Mat Keep and Bernd Ocklin discuss what’s new in the second milesone release of MySQL Cluster 7.2: performance improvements, new NoSQL access (memcached protocol), cross data center scalability. Download the mp3.

Original title and link: Podcast: MySQL Cluster News: Performance Improvements,New NoSQL Access (NoSQL database©myNoSQL)


The Story of Etsy's Architecture

Ars Technica’s Sean Gallagher summarizes a presentation given at Surge conference covering the evolution of Etsy’s architecture from a centralized PostgreSQL stored procedures based solution to a sharded MySQL and going through a failed service oriented-like architecture:

And the team started to shift feature by feature away from a semi-monolithic Postgres back-end to sharded MySQL databases. “It’s a battle-tested approach,” Snyder said. “Flickr is using it on an enormous scale. It scales horizontally, basically, to near infinity, and there’s no single point of failure—it’s all master to master replication.”

Original title and link: The Story of Etsy’s Architecture (NoSQL database©myNoSQL)

via: http://arstechnica.com/business/news/2011/10/when-clever-goes-wrong-how-etsy-overcame-poor-architectural-choices.ars


Help CouchDB Break the C10K Barrier

Over the weekend, I was experimenting with CouchDB to see if it can pass the C10K barrier. Some of the performance optimizations I made along the way are really OS-level optimizations that affect MochiWeb (erlang web server) and fairly well documented in many blogs. This one by @metabrew in particular is a pretty good read, since it focuses on Erlang and MochiWeb. While I am a performance junkie, I am not an Erlang hacker. So this is a call for help to the CouchDB hackers for recommendations on scaling out CouchDB.

The initial tweaks made by the blitz.io guys, took CouchDB from under 1000 concurrent users to around 2300 concurrent users. There’s still a long way to 10k concurrent users and they’d appreciate your help.

Original title and link: Help CouchDB Break the C10K Barrier (NoSQL database©myNoSQL)

via: http://blog.mudynamics.com/2011/09/05/help-couchdb-break-the-c10k-barrier/


What Scales Best?

Tony Bain:

What is best?  Well that comes down to the resulting complexity, cost, performance and other trade-offs.  Trade-offs are key as there are almost always significant concessions to be made as you scale up.

[…]

So what is my point? Well I guess what I am saying is physical scalability is of course an important consideration in determining what is best. But it is only one side of the coin. What it “costs” you in terms of complexity, actual dollars, performance, flexibility, availability, consistency etc, etc are all important too. And these are often relative, what is complex for you may not be complex for someone else.

I concur—a long time ago I wrote: Complexity is a dimension of scalability.

Original title and link: What Scales Best? (NoSQL database©myNoSQL)

via: http://blog.tonybain.com/tony_bain/2011/07/what-scales-best.html


The Server Architecture Debate Rages On

Big processors or little processors, scale-up or scale-out, on-premise or in the cloud […] The plethora of choices for application architecture and delivery model are great if you like variety, but I don’t envy anyone tasked with choosing which system on which to spend their limited budget dollars.

Too little options is bad[1]. Too many options are paralizing[2]. Then what’s the solution? I think the only answer is to build experience. By trying, failing, learning, and sharing with everyone else.

Original title and link: The Server Architecture Debate Rages On (NoSQL database©myNoSQL)

via: http://gigaom.com/cloud/the-server-architecture-debate-rages-on/


The NoSQL Fad

Adam D’Angelo[1]:

I think the “NoSQL” fad will end when someone finally implements a distributed relational database with relaxed semantics.

I believe that defining these relaxed semantics will actually lead to figuring out the origins of many of the NoSQL solutions—just as an example, relaxing the relational model would lead to options like the document model or the BigTable-like columnar model.


  1. Adam D’Angelo: Quora Founder  

Original title and link: The NoSQL Fad (NoSQL database©myNoSQL)

via: http://www.quora.com/Why-does-Quora-use-MySQL-as-the-data-store-rather-than-NoSQLs-such-as-Cassandra-MongoDB-CouchDB-etc


HBase Load Balancing Explained

Ted Yu explains the internals of the HBase load balancing with references to corresponding JIRA tickets and the latest improvements:

If at least one region server joined the cluster just before the current balancing action, both new and old regions from overloaded region servers would be moved onto underloaded region servers. Otherwise, I find the new regions and put them on different underloaded servers. Previously one underloaded server would be filled up before the next underloaded server is considered.

I am planning for the next generation of load balancer where request histogram would play an important role in deciding which regions to move.

HBase load balancing has also been discussed in this (older) conversation on the mailing list.

Original title and link: HBase Load Balancing Explained (NoSQL databases © myNoSQL)

via: http://zhihongyu.blogspot.com/2011/04/load-balancer-in-hbase-090.html