scalability: All content tagged as scalability in NoSQL databases and polyglot persistence
Wednesday, 10 April 2013
Paper: An Analysis of Linux Scalability to Many Cores
A paper authored by a team from MIT CSAIL whose goal is to identify various scalability issues in the Linux kernel:
This paper analyzes the scalability of seven system applications (Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on Linux on a 48-core computer. Except for gmake, all applications trigger scalability bottlenecks inside a recent Linux kernel. Using mostly standard parallel programming techniques—this paper introduces one new technique, sloppy counters—these bottlenecks can be removed from the kernel or avoided by changing the applications slightly. Modifying the kernel required in total 3002 lines of code changes. A speculative conclusion from this analysis is that there is no scalability reason to give up on traditional operating system organizations just yet.
Interesting choice of tools. Note that the team used an in-memory file system to eliminate the disk-related bottlenecks.
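The abstract names sloppy counters without explaining them. As a rough illustration only, here is a simplified sketch of the general idea: each core counts locally and touches the shared total once per batch, trading read accuracy for less contention. This is not the kernel implementation from the paper; the class name, the `slot` parameter (standing in for a core), and the `threshold` value are all assumptions made for the example.

```python
import threading


class SloppyCounter:
    """Approximate shared counter: each slot (think "core") counts locally and
    only touches the shared total once per `threshold` increments."""

    def __init__(self, num_slots: int, threshold: int) -> None:
        self.threshold = threshold
        self.global_count = 0
        self.global_lock = threading.Lock()
        self.local_counts = [0] * num_slots
        self.local_locks = [threading.Lock() for _ in range(num_slots)]

    def increment(self, slot: int) -> None:
        # Fast path: only the slot-local lock is taken, so slots do not contend.
        with self.local_locks[slot]:
            self.local_counts[slot] += 1
            if self.local_counts[slot] >= self.threshold:
                # Slow path: flush the local batch into the shared total.
                with self.global_lock:
                    self.global_count += self.local_counts[slot]
                self.local_counts[slot] = 0

    def approximate(self) -> int:
        # Cheap read; may lag the true value by up to num_slots * (threshold - 1).
        with self.global_lock:
            return self.global_count

    def exact(self) -> int:
        # Expensive read: lock every slot (always in the same order), then sum.
        for lock in self.local_locks:
            lock.acquire()
        try:
            with self.global_lock:
                return self.global_count + sum(self.local_counts)
        finally:
            for lock in self.local_locks:
                lock.release()
```

A larger `threshold` means fewer trips to the shared lock but a staler `approximate()` value, which is the trade-off the "sloppy" in the name refers to.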
Original title and link: Paper: An Analysis of Linux Scalability to Many Cores (©myNoSQL)
via: http://www.stanford.edu/class/cs240/readings/analysis-linux-scalability.pdf
Monday, 11 February 2013
How Do I Freaking Scale Oracle?
Andrew Oliver for InfoWorld:
That said, many companies I work with have spent 20 years painting themselves into an Oracle corner. While they may have one eye on a brighter future, they still must ensure their Oracle database is high-performance and highly available — and scales as well as possible. Despite what you may read in NoSQL vendor marketing materials (or even in my blog [2]), it is possible to scale Oracle.
If you can actually find the answer to the question in the title, please teach me or send me links. This article left me in the dark. The only light I’ve seen involves way too many additional products that most probably cost quite a bit.
Thursday, 24 May 2012
The Myth of Auto Scaling as a Capacity Planning Approach
An older but very instructive post by James Golick dissecting the mythical extra server capacity:
There’s this idea floating around that we can scale out our data services “just in time”. Proponents of cloud computing frequently tout this as an advantage of such a platform. Got a load spike? No problem, just spin up a few new instances to handle the demand. It’s a great sounding story, but sadly, things don’t quite work that way.
This is the Mythical Man-Month of the IT department.
via: http://jamesgolick.com/2010/10/27/we-are-experiencing-too-much-load-lets-add-a-new-server..html
Monday, 26 March 2012
Two Sides of the OMGPOP Cloud and Couchbase Scalability Story
On Friday, many media sites (GigaOm, BusinessInsider) ran the press release about OMGPOP’s growth story, which cites cloud services and Couchbase as the company’s scaling solution.
While reading it, I jotted down:
- The good: using a combination of cloud and a NoSQL database (Couchbase) allowed OMGPOP to scale
- The bad: OMGPOP had to call in people from Couchbase to help out with scaling
The question is: if you can throw in more iron and hire experts, wouldn’t many other database solutions be able to cope with OMGPOP’s growth?
Wednesday, 8 February 2012
What other popular paradigms/architectures can handle large scale computational problems?
Interesting answers on Quora mostly expanding on Krishna Sankar’s short answer:
There are two ways one can address large scale computational problems:
- Task Parallelism: This is where MPI and so forth fit in
- Data Parallelism: This is the sweet spot for map/reduce
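To make the distinction concrete, here is a minimal sketch using only Python's standard library. It is an illustration of the two shapes, not of MPI or any map/reduce framework; the function names and the word-count example are assumptions chosen for brevity. Threads are used for simplicity, so in CPython this shows the structure rather than a real CPU speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk: str) -> Counter:
    # The same operation, applied independently to one partition of the data.
    return Counter(chunk.split())


def data_parallel_word_count(chunks: list[str]) -> Counter:
    # Data parallelism: one function, many partitions (map), then a merge (reduce).
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def task_parallel_stats(numbers: list[int]) -> dict[str, int]:
    # Task parallelism: different functions run concurrently on the same input.
    with ThreadPoolExecutor() as pool:
        f_min = pool.submit(min, numbers)
        f_max = pool.submit(max, numbers)
        f_sum = pool.submit(sum, numbers)
        return {"min": f_min.result(), "max": f_max.result(), "sum": f_sum.result()}
```

In the first function the work scales with the number of partitions; in the second it scales with the number of distinct tasks, which is usually much smaller.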
Thursday, 19 January 2012
Auto Scaling in the Amazon Cloud: Netflix's Approach and Lessons Learned
Another great post for today from the engineering team at Netflix:
Auto scaling is a very powerful tool, but it can also be a double-edged sword. Without the proper configuration and testing it can do more harm than good. A number of edge cases may occur when attempting to optimize or make the configuration more complex. As seen above, when configured carefully and correctly, auto scaling can increase availability while simultaneously decreasing overall costs.
via: http://techblog.netflix.com/2012/01/auto-scaling-in-amazon-cloud.html
Tuesday, 17 January 2012
Asking for Performance and Scalability Advice on StackOverflow
How many times have you gotten an answer that applies to your specific scenario after providing only a short list of performance and scalability requirements? MySQL/InnoDB can do 750k qps, Cassandra is scaling linearly, MongoDB can do 8 mil ops/s. Are any of these the answer for your application?
Actually:
- How many times did you get all the requirements right at spec time?
- How many times did requirements remain the same during the development cycle?
- How many times did production reality confirm your bullet list of requirements?
Friday, 25 November 2011
Podcast: MySQL Cluster News: Performance Improvements, New NoSQL Access
Mat Keep and Bernd Ocklin discuss what’s new in the second milestone release of MySQL Cluster 7.2: performance improvements, new NoSQL access (memcached protocol), cross data center scalability. Download the mp3.
Monday, 3 October 2011
The Story of Etsy's Architecture
Ars Technica’s Sean Gallagher summarizes a presentation given at the Surge conference covering the evolution of Etsy’s architecture: from a centralized solution based on PostgreSQL stored procedures, through a failed service-oriented-like architecture, to sharded MySQL:
And the team started to shift feature by feature away from a semi-monolithic Postgres back-end to sharded MySQL databases. “It’s a battle-tested approach,” Snyder said. “Flickr is using it on an enormous scale. It scales horizontally, basically, to near infinity, and there’s no single point of failure—it’s all master to master replication.”
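The quote does not say how rows are assigned to shards. As a minimal sketch of the general idea only, the routing step can be as small as a stable hash of the key; the shard names and the choice of CRC32 below are assumptions for illustration and say nothing about what Etsy or Flickr actually run.

```python
import zlib

# Hypothetical shard names; a real deployment would map these to database hosts.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(user_id: int, shards: list[str] = SHARDS) -> str:
    """Pick a shard for a user with a stable hash of the user id."""
    # CRC32 is deterministic across processes, unlike Python's built-in hash().
    key = str(user_id).encode("utf-8")
    return shards[zlib.crc32(key) % len(shards)]
```

Note that plain modulo hashing remaps most keys whenever the number of shards changes, which is one reason a sharded setup may prefer an explicit lookup table from key to shard.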
Tuesday, 6 September 2011
Help CouchDB Break the C10K Barrier
Over the weekend, I was experimenting with CouchDB to see if it can pass the C10K barrier. Some of the performance optimizations I made along the way are really OS-level optimizations that affect MochiWeb (erlang web server) and fairly well documented in many blogs. This one by @metabrew in particular is a pretty good read, since it focuses on Erlang and MochiWeb. While I am a performance junkie, I am not an Erlang hacker. So this is a call for help to the CouchDB hackers for recommendations on scaling out CouchDB.
The initial tweaks made by the blitz.io guys took CouchDB from under 1,000 concurrent users to around 2,300. That is still a long way from 10k concurrent users, and they’d appreciate your help.
via: http://blog.mudynamics.com/2011/09/05/help-couchdb-break-the-c10k-barrier/
Monday, 8 August 2011
What Scales Best?
Tony Bain:
What is best? Well that comes down to the resulting complexity, cost, performance and other trade-offs. Trade-offs are key as there are almost always significant concessions to be made as you scale up.
[…]
So what is my point? Well I guess what I am saying is physical scalability is of course an important consideration in determining what is best. But it is only one side of the coin. What it “costs” you in terms of complexity, actual dollars, performance, flexibility, availability, consistency etc, etc are all important too. And these are often relative, what is complex for you may not be complex for someone else.
I concur. A long time ago I wrote: Complexity is a dimension of scalability.
via: http://blog.tonybain.com/tony_bain/2011/07/what-scales-best.html