MySQL Cluster: All content tagged as MySQL Cluster in NoSQL databases and polyglot persistence

YCSB Benchmark Results for Cassandra, HBase, MongoDB, MySQL Cluster, and Riak

Put together by the team at Altoros Systems Inc., this benchmark was run on Amazon EC2 and includes Cassandra, HBase, MongoDB, MySQL Cluster, sharded MySQL, and Riak:

After some of the results had been presented to the public, some observers said MongoDB should not be compared to other NoSQL databases because it is more targeted at working with memory directly. We certainly understand this, but the aim of this investigation is to determine the best use cases for different NoSQL products. Therefore, the databases were tested under the same conditions, regardless of their specifics.

Teaser: HBase got the best results in most of the benchmarks (with flush turned off though). And I’m not sure the setup included the latest HBase read improvements from Facebook.

Original title and link: YCSB Benchmark Results for Cassandra, HBase, MongoDB, MySQL Cluster, and Riak (NoSQL database©myNoSQL)

via: http://www.networkworld.com/cgi-bin/mailto/x.cgi?pagetosend=/news/tech/2012/102212-nosql-263595.html&pagename=/news/tech/2012/102212-nosql-263595.html&pageurl=http://www.networkworld.com/news/tech/2012/102212-nosql-263595.html&site=printpage&nsdr=n


MySQL Cluster Used to Implement a Highly Available and Scalable Hadoop NameNode

Given the following Hadoop NameNode problem:

the problem is, if the Namenode crashes, the entire file system becomes inoperable because clients and Datanodes still need the metadata to do anything useful. Furthermore, since the Namenode maintains all the metadata only in memory, the number of files you can store on the filesystem is directly proportional to the amount of RAM the Namenode has. As if that’s not enough, the Namenode will be completely saturated under write intensive workloads, and will be unable to respond to even simple client side queries like ls. Have a look at Shvachko’s paper which describes these problems at great length and depth, on which we’ve based our work.
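To make the RAM-bound claim concrete, here is a back-of-the-envelope sketch. It assumes the commonly cited rule of thumb of roughly 150 bytes of Namenode heap per file, directory, or block object; that figure is an assumption for illustration and does not come from the quoted post, and the function name is hypothetical.

```python
# Rough estimate only: BYTES_PER_OBJECT is an assumed rule of thumb,
# not a number taken from the quoted post.
BYTES_PER_OBJECT = 150


def max_files(heap_bytes, blocks_per_file=1):
    """Estimate how many files fit in a given amount of Namenode heap.

    Each file is counted as one file object plus its block objects.
    """
    objects_per_file = 1 + blocks_per_file
    return heap_bytes // (BYTES_PER_OBJECT * objects_per_file)


GIB = 1024 ** 3
small = max_files(16 * GIB)   # estimate for a 16 GiB heap
large = max_files(32 * GIB)   # doubling the heap roughly doubles the estimate
```

Under these assumptions the file count grows linearly with heap size, which is the proportionality the quote describes.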

Lalith Suresh has worked for the last couple of months on the following solution:

“Move all of the Namenode’s metadata storage into an in-memory, replicated, share-nothing distributed database.”

[…] We chose MySQL Cluster as our database because of its widespread use and stability. So for the filesystem to scale to a larger number of files, one needs to add more MySQL Cluster Datanodes, thus moving the bottleneck from the Namenode’s RAM to the DB’s storage capacity. For the filesystem to handle heavier workloads, one needs to add only more Namenode machines and divide the load amongst them. Another interesting aspect is that if a single Namenode machine has to reboot, it needn’t fetch any state into memory and will be ready for action within a few seconds (although it still has to sync with Datanodes). Another advantage of our design is that the modifications will not affect the clients or Datanodes in any way, except that we might need to find a way to divide the load among the Namenodes.
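The design described in the quote can be sketched in a few lines. This is only a toy illustration, not Lalith's implementation: the `MetadataStore` class is a plain in-process dict standing in for the database, and the `Namenode` and `RoundRobinClient` names, as well as the round-robin policy, are hypothetical choices made for the example.

```python
import itertools


class MetadataStore:
    """Stand-in for the shared database (a plain dict here, not MySQL Cluster)."""

    def __init__(self):
        self._files = {}

    def put(self, path, blocks):
        self._files[path] = list(blocks)

    def get(self, path):
        return self._files.get(path)

    def list_dir(self, prefix):
        base = prefix.rstrip("/") + "/"
        return sorted(p for p in self._files if p.startswith(base))


class Namenode:
    """Hypothetical stateless Namenode: every call goes to the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def create(self, path, blocks):
        self.store.put(path, blocks)

    def ls(self, prefix):
        return self.store.list_dir(prefix)


class RoundRobinClient:
    """One assumed way to divide load: rotate over the available Namenodes."""

    def __init__(self, namenodes):
        self._cycle = itertools.cycle(namenodes)

    def pick(self):
        return next(self._cycle)


store = MetadataStore()
client = RoundRobinClient([Namenode("nn1", store), Namenode("nn2", store)])
client.pick().create("/logs/a.txt", ["blk_1"])
client.pick().create("/logs/b.txt", ["blk_2", "blk_3"])

# A "rebooted" Namenode holds no state of its own, so it can serve immediately.
restarted = Namenode("nn1", store)
listing = restarted.ls("/logs")
```

Because the Namenode objects keep nothing in memory, any of them can answer any request, and a replacement instance sees the same metadata as soon as it connects to the store.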

His post covers the how, as well as the pros and cons of his solution. The result is available on GitHub.

Update: Hortonworks is already working on the next generation of Apache Hadoop MapReduce, which focuses on reliability, availability, scalability, and predictable latency. But this doesn’t make Lalith’s work less interesting.

Original title and link: MySQL Cluster Used to Implement a Highly Available and Scalable Hadoop NameNode (NoSQL database©myNoSQL)

via: http://lalith.in/2011/12/15/towards-a-scalable-and-highly-available-namenode/


MySQL Sharding vs MySQL Cluster

StackExchange Q&A:

Q: Considering performance only, can a MySQL Cluster beat a custom data sharding MySQL solution?

A: I would say that MySQL Cluster could achieve higher throughput/host than sharded MySQL+InnoDB provided that:

  • Queries are simple
  • All data fits in-memory

In terms of latency, MySQL Cluster should have more stable latency than sharded MySQL. Actual latencies for purely in-memory data could be similar. As queries become more complex, and data is stored on disk, the performance comparison becomes more confusing.
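For context on what a "custom data sharding" setup involves, here is a minimal sketch of application-level shard routing. The shard names and the hash-modulo scheme are assumptions for illustration, not something taken from the answer. MySQL Cluster, by contrast, partitions table data across its data nodes itself, so the application does not carry this logic.

```python
import hashlib

# Hypothetical shard names; a real setup would hold connection details.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the key, so the same key always maps to the same shard."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def group_by_shard(keys, shards=SHARDS):
    """Group keys by target shard; a multi-key query needs one round trip per group."""
    groups = {}
    for key in keys:
        groups.setdefault(shard_for(key, shards), []).append(key)
    return groups


groups = group_by_shard(["user:1", "user:2", "user:3", "user:4", "user:5"])
```

Single-key lookups touch one shard, while anything spanning keys has to be fanned out and merged by the application, which is one reason the comparison gets murkier as queries become more complex.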

Make sure you read the complete answer as it covers some more MySQL Sharding vs MySQL Cluster pros and cons.

Mat Keep

Original title and link: MySQL Sharding vs MySQL Cluster (NoSQL database©myNoSQL)


Podcast: MySQL Cluster News: Performance Improvements, New NoSQL Access

Mat Keep and Bernd Ocklin discuss what’s new in the second milestone release of MySQL Cluster 7.2: performance improvements, new NoSQL access (memcached protocol), cross data center scalability. Download the mp3.
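As a rough idea of what access over the memcached protocol looks like on the wire, the sketch below builds and parses messages in the standard memcached text protocol. It does not open a connection and says nothing specific about how MySQL Cluster 7.2 maps keys to tables; the helper names are hypothetical.

```python
def build_set(key, value, flags=0, exptime=0):
    """Build a text-protocol request: set <key> <flags> <exptime> <bytes>, then the data."""
    data = value.encode("utf-8")
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode("ascii")
    return header + data + b"\r\n"


def build_get(key):
    """Build a text-protocol request: get <key>."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw):
    """Parse a single-key reply: VALUE <key> <flags> <bytes>, the data, then END.

    Returns None when the server reports a miss (a bare END line).
    """
    if raw == b"END\r\n":
        return None
    header, rest = raw.split(b"\r\n", 1)
    parts = header.split()
    if parts[0] != b"VALUE":
        raise ValueError("unexpected response header")
    length = int(parts[3])
    return rest[:length].decode("utf-8")


request = build_set("user:1", "alice")
value = parse_get_response(b"VALUE user:1 0 5\r\nalice\r\nEND\r\n")
```

The appeal of this kind of interface is that simple key-value reads and writes can skip SQL parsing entirely while the same data remains reachable through SQL.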

Original title and link: Podcast: MySQL Cluster News: Performance Improvements, New NoSQL Access (NoSQL database©myNoSQL)