MySQL: All content tagged as MySQL in NoSQL databases and polyglot persistence

Redis Mass Data Import: MySQL to Redis in One Step

Derek Watson:

In moving a relatively large table from MySQL to Redis, you may find that extracting, transforming and loading a row at a time can be excruciatingly slow. Here’s a quick trick you can use that pipes the output of the mysql command directly to redis-cli, bypassing middleware and allowing both data stores to operate at their peak speed.

Nice trick, which by the way is documented on the Redis site. With some more work (worthwhile for larger data sets), you could actually generate a Redis RDB file directly.
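
The mass-import approach depends on feeding redis-cli commands already encoded in the Redis wire protocol. Below is a minimal sketch of that encoding step, assuming the rows have already been read from MySQL as (key, value) pairs. The function names and the use of SET are illustrative choices, not taken from the original post.

```python
def resp_command(*args):
    """Encode one Redis command in the Redis protocol (RESP) as bytes."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode("utf-8")
        # The length prefix counts bytes, not characters.
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def rows_to_resp(rows):
    """Turn (key, value) rows into one RESP payload of SET commands.

    `rows` is a hypothetical stand-in for a MySQL result set.
    """
    return b"".join(resp_command("SET", key, value) for key, value in rows)
```

Written to standard output, a payload like this could be piped into `redis-cli --pipe`, the mode the Redis documentation describes for mass insertion.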

Original title and link: Redis Mass Data Import: MySQL to Redis in One Step (NoSQL database©myNoSQL)


Three Analyst Predictions for 2013: Hadoop, SAP, and MySQL vs NoSQL

The season of predictions is here. Chris Kanaracus in an all-bold post, quoting analysts:

Jon Reed: “Expect SAP to purchase an up-and-coming “big data” product or vendor, and perhaps several, including at least one that specializes in integration with the Hadoop framework for large-scale data processing”.

I’m still scratching my head trying to come up with a long list of products or vendors specializing in Hadoop integration that SAP could acquire.

Curt Monash: “Expect plenty of additional adoption for Hadoop. Everybody has the ‘big bit bucket’ use case, largely because of machine-generated data. Even today’s technology is plenty good enough for that purpose, and hence justifies initial Hadoop adoption.”


What I hope to see happening is that besides the companies putting together the building blocks to make Hadoop friendly enough (real work) and the companies claiming integration with Hadoop (not that fantastic work), there’ll be some companies that take the Hadoop stack and build tools whose immediate impact on the business can be measured. Basically, vertical solutions applying the Hadoop stack to specific markets, segments, and scenarios.

The main challenge of “Big Data” these days is not a lack of value behind it. It’s the measurability of this value. What each company looking into Big Data tries to answer is: what value does big data carry for my case? This is a well-founded question, as not every company has an infinite budget, unlimited time, and a magic resource pool.

Curt Monash: “Usually when the topic of alternative databases comes up, the incumbent is often Oracle or IBM DB2. But in 2013, MySQL could be playing the latter role. NoSQL and NewSQL products often are developed as MySQL alternatives.”

Until now, NoSQL companies have understood that the competition is not with each other. The huge market covered by relational databases has enough potential to welcome a few solid NoSQL solutions, and there’s no long-term need to fight over the few people who are already paying attention to them.

Make your bets.

Original title and link: Three Analyst Predictions for 2013: Hadoop, SAP, and MySQL vs NoSQL (NoSQL database©myNoSQL)

Diff Tools for MySQL Configurations

From the Webyog blog, two tools to compare and diff both the static and runtime configurations of MySQL servers:

  1. pt-config-diff, part of Percona Toolkit (free)
  2. MONyog visual diff and configuration tracker (commercial)
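
To make the idea of a configuration diff concrete, here is a minimal sketch that compares two sets of server settings already loaded into dictionaries. It is an illustration only and says nothing about how pt-config-diff or MONyog work internally. The function name and the sample values are assumptions.

```python
def diff_config(left, right):
    """Return {name: (left_value, right_value)} for settings that differ.

    A setting missing on one side is reported with None for that side.
    """
    diff = {}
    for name in sorted(set(left) | set(right)):
        a = left.get(name)
        b = right.get(name)
        if a != b:
            diff[name] = (a, b)
    return diff


# Hypothetical settings for two servers.
server_a = {"max_connections": "151", "innodb_buffer_pool_size": "128M"}
server_b = {"max_connections": "500", "innodb_buffer_pool_size": "128M",
            "sync_binlog": "1"}

for name, (a, b) in diff_config(server_a, server_b).items():
    print(name, a, b)
```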

Original title and link: Diff Tools for MySQL Configurations (NoSQL database©myNoSQL)


Test Driving Database Indexes

Myron Marston:

Database indexes are conceptually very simple, but in practice, I’ve found that it’s hard to predict when they’ll get used and what indexes a given table needs. On a project at work I came up with the idea to test-drive my database indexes, just like I test-drive the rest of my code. I’d like to share the approach I came up with.

A very interesting idea, at least for MySQL users.
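
The general shape of such a test can be sketched as follows. This is not Myron Marston’s implementation: to keep the example self-contained it swaps in SQLite’s `EXPLAIN QUERY PLAN` from the Python standard library rather than MySQL’s `EXPLAIN`, and the table, index, and helper names are hypothetical.

```python
import sqlite3


def uses_index(conn, sql, params, index_name):
    """Return True if the query plan for `sql` mentions `index_name`."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail string.
    return any(index_name in row[-1] for row in plan)


def make_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)"
    )
    return conn


QUERY = "SELECT name FROM users WHERE email = ?"

conn = make_db()
# Without the index the planner has to scan the table.
assert not uses_index(conn, QUERY, ("a@example.com",), "idx_users_email")

conn.execute("CREATE INDEX idx_users_email ON users (email)")
# With the index in place the plan should reference it.
assert uses_index(conn, QUERY, ("a@example.com",), "idx_users_email")
```

The test is written first, fails while the index is missing, and passes once the index is created, which mirrors the test-driven cycle applied to application code.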

Original title and link: Test Driving Database Indexes (NoSQL database©myNoSQL)


Provisioned IOPS for Amazon RDS

Werner Vogels:

Following the huge success of being able to provision a consistent, user-requested I/O rate for DynamoDB and Elastic Block Store (EBS), the AWS Database Services team has now released Provisioned IOPS, a new high performance storage option for the Amazon Relational Database Service (Amazon RDS). Customers can provision up to 10,000 IOPS (input/output operations per second) per database instance to help ensure that their databases can run the most stringent workloads with rock solid, consistent performance.

Amazon is the first company I know of championing guaranteed performance SLAs. Until recently, most SLAs referred to availability, resilience, and redundancy. But soon performance-based SLAs will become the norm for other service providers. I’d also expect appliance vendors to be asked for similar guarantees sooner rather than later.

Original title and link: Provisioned IOPS for Amazon RDS (NoSQL database©myNoSQL)


Big Data at Aadhaar With Hadoop, HBase, MongoDB, MySQL, and Solr

It’s unfortunate that the post focuses mostly on the usage of Spring and RabbitMQ and the slide deck doesn’t dive deeper into the architecture, data flows, and data stores, but the diagrams below should give you an idea of this truly polyglot persistence architecture:

Architecture of Big Data at Aadhaar

Big Data at Aadhaar Data Stores

The slide deck presenting architecture principles and numbers about the platform after the break.

A Quick Test of the New MySQL Memcached Plugin With (J)Ruby

Gabor Vitez:

With a new post hitting Hacker News again on MySQL’s memcached plugin, I really wanted to do a quick-and-dirty benchmark on it, just to see what good it is – does this interface offer any extra speed when compared to SQL+ActiveRecord? Does it have it’s place in the software stack? How much work is needed to get this combination off the ground?

When running into micro-benchmarks, I really try my best to figure out if they hide any value. But it’s hard to find any in one that uses different libraries, runs both the benchmark and the server on the same machine and uses no concurrency.

Original title and link: A Quick Test of the New MySQL Memcached Plugin With (J)Ruby (NoSQL database©myNoSQL)


Klout Data Architecture: MySQL, HBase, Hive, Pig, Elastic Search, MongoDB, SSAS

Just found a slide deck (embedded below) describing the data workflow at Klout. Their architecture includes many interesting pieces, combining both NoSQL and relational databases with Hadoop, Hive, Pig, and traditional BI. Even Excel gets a mention in the slides:

  1. Pig and Hive
  2. HBase
  3. Elastic Search
  4. MongoDB
  5. MySQL

Klout Data Architecture

Generating Meaningful Test Data Using a MySQL Function

Ronald Speelman:

You can use this MySQL function to generate names, (e-mail)addresses, phone numbers, urls, bit values, colors, IP address, etc.. As usual, the code is provided in a zipfile and the code is fully documented.

For the last couple of days I’ve been looking for a way to generate some good test data in JSON format[1], so if you are aware of something, please drop me a note.

  1. Right now I’m using as input a corpus of combined JSON files I’ve found online, but I’m not happy with the solution.
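
One possible direction for JSON test data, offered as a rough sketch rather than a finished tool: build records from small word lists with a seeded random generator so that runs are repeatable. Every field name and sample value below is an assumption made for illustration.

```python
import json
import random

# Hypothetical seed lists; a real generator would use larger ones.
FIRST_NAMES = ["Ada", "Linus", "Grace", "Ken"]
DOMAINS = ["example.com", "example.org"]


def fake_record(rng, record_id):
    """Build one fake user record from the given random generator."""
    name = rng.choice(FIRST_NAMES)
    return {
        "id": record_id,
        "name": name,
        "email": "%s%d@%s" % (name.lower(), record_id, rng.choice(DOMAINS)),
        "ip": ".".join(str(rng.randint(0, 255)) for _ in range(4)),
        "active": rng.random() < 0.5,
    }


def fake_json(count, seed=0):
    """Return a JSON array of `count` fake records; same seed, same data."""
    rng = random.Random(seed)
    return json.dumps([fake_record(rng, i) for i in range(1, count + 1)])
```

Calling `fake_json(1000, seed=1)` would produce a repeatable array of a thousand records, which makes failures found with the data easier to reproduce.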

Original title and link: Generating Meaningful Test Data Using a MySQL Function (NoSQL database©myNoSQL)


MySQL Is Bazillion Times Faster Than MemSQL

Domas Mituzas on the MemSQL vs MySQL benchmark:

Though I usually understand that those claims don’t make any sense, I was wondering what did they do wrong. Apparently they got MySQL with default settings running and MemSQL with default settings running, then compared the two. They say it is a good benchmark, as it compares what users get just by installing standard packages.

That is already cheating, because systems are forced to work in completely different profiles.

The first paragraph of the post summarizes very well the general feeling about benchmarks:

I don’t like stupid benchmarks, as they waste my time.

I think that most generic benchmarks are stupid, even if software engineers find some of the generic numbers interesting. Benchmarks designed around specific application scenarios will most of the time give more realistic results. But even those are difficult to design so that they account for all the configuration options, scaling, or changes in the use cases.

Original title and link: MySQL Is Bazillion Times Faster Than MemSQL (NoSQL database©myNoSQL)


A Tragically Comedic Security Flaw in MySQL

In short, if you try to authenticate to a MySQL server affected by this flaw, there is a chance it will accept your password even if the wrong one was supplied. The following one-liner in bash will provide access to an affected MySQL server as the root user account, without actually knowing the password.

  $ for i in `seq 1 1000`; do mysql -u root --password=bad -h 2>/dev/null; done

Don’t try this at home. Or if you try it, don’t tell anyone the result.

Original title and link: A Tragically Comedic Security Flaw in MySQL (NoSQL database©myNoSQL)


Tumblr Jetpants: A Toolkit for Huge MySQL Topologies


Today, we’re happy to announce the open source release of Jetpants, Tumblr’s in-house toolchain for managing huge MySQL database topologies. Jetpants offers a command suite for easily cloning replicas, rebalancing shards, and performing master promotions. It’s also a full Ruby library for use in developing custom billion-row migration scripts, automating database manipulations, and copying huge files quickly to multiple remote destinations.

This toolkit helps Tumblr manage its 60bn rows totaling 21TB of data in a MySQL cluster.

Original title and link: Tumblr Jetpants: A Toolkit for Huge MySQL Topologies (NoSQL database©myNoSQL)