


The NoSQL Family Tree


Even if it includes just a handful of NoSQL databases, it’s still a nice visualization.

Original title and link: The NoSQL Family Tree (NoSQL database©myNoSQL)


Examples of analytics applications across industries

A great matrix of the different analytics use cases across industries in Hortonworks’s post “Enterprise Hadoop and the Journey to a Data Lake”:

Analytics use cases

The data type column section covers multiple dimensions of data. And the authors took a conservative approach for the structured and unstructured categories (in the sense that they marked very few categories as unstructured).

A couple of interesting exercises that can be done using this matrix as an input:

  1. figure out how adding data from different categories to a specific use case would benefit it. One obvious example is: how would telecom companies benefit from adding social data to their infrastructure analysis?

    Building on the above, decide what tools exist to help with this extra scenario.

  2. can one use case from an industry be applied to a different industry to disrupt it?

    What would be the quickest road to accomplish it?


Cassandra hits 1 million writes per second on Google Compute Engine

Google using Cassandra to show the performance and cost efficiency of the Google Compute Engine:

  • sustain one million writes per second to Cassandra with a median latency of 10.3 ms and 95% completing under 23 ms
  • sustain a loss of 1/3 of the instances and volumes and still maintain the 1 million writes per second (though with higher latency)
  • scale up and down linearly so that the configuration described can be used to create a cost effective solution
  • go from nothing in existence to a fully configured and deployed instances hitting 1 million writes per second took just 70 minutes. A configured environment can achieve the same throughput in 20 minutes.

Make sure you check the charts and read through to the conclusion. The other conclusion I’d suggest is: based on the real benchmarks I’ve seen over the years, Cassandra is the only system that has been proven to scale linearly and provide top performance [1].

  1. Before saying that I’m biased, make sure you read at least this story and Netflix’s post.



4 Reasons Perfect Market chose MongoDB

The team at Perfect Market on choosing MongoDB for their Digital Publishing Suite:

There are many NoSQL products out there, why did we bet on MongoDB? There are four major reasons: great performance, great features, ease of use and great support. Of course not every day with MongoDB is a sunshine day. Some tradeoffs we made are shared at the end of this post.

  1. I’m sure Perfect Market would get great support from almost every NoSQL database vendor — that’s what I’ve always heard in this market segment.
  2. By great performance I’ll assume Perfect Market got the numbers they needed. While presented as the top reason for choosing MongoDB, I think this was more in line with: “considering these other features, is MongoDB’s performance good enough for us?”.

    MongoDB is not the fastest NoSQL database.

  3. Great features and ease of use. Nobody can deny that, at least at first glance, MongoDB’s feature set is very compelling. And they’ve absolutely nailed the user experience part.

    My hypothesis for MongoDB’s adoption rate has always been that it’s mostly due to it looking familiar to people with relational database experience while removing most of the strict constraints that come with relational databases. This is echoed in this post too:

    Although MongoDB is a NoSQL document DBMS, it bears resemblance to RDBMS’s.



VoltDB raises $8M in Series B


VoltDB has raised $8 million from Sigma Ventures, Kepha Partners and three other “strategic investors”, bringing total venture capital investment to $18.7 million, said its CEO, Bruce Reading. Sigma and Kepha participated in an earlier round, in 2012, through which it raised $5.7 million.

I assume some will say it’s a small round. I’ll say congrats to the VoltDB team.



Bloomberg says Cloudera raises at least $200m in new round

Dina Bass and Serena Saitto (Bloomberg):

Cloudera Inc. is raising at least $200 million in a new round of financing from investors including Intel Corp., according to people with knowledge of the situation.

Not confirmed yet.



A simple distributed algorithm for small idempotent information

In this blog post I’m going to describe a very simple distributed algorithm that is useful in different programming scenarios. The algorithm is useful when you need to take some kind of information synchronized among a number of processes. The information can be everything as long as it is composed of a small number of bytes, and as long as it is idempotent, that is, the current value of the information does not depend on the previous value, and we can just replace an old value, with the new one.

While reading this post from Salvatore Sanfilippo, all I could picture were the diagrams in James Mickens’ “The Saddest Moment” paper.
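
To make the "small and idempotent" property from the quote concrete, here is a minimal sketch in Python. It is not the algorithm from Salvatore's post, only an illustration of why replace-only information is easy to keep in sync. The `Versioned` type, the integer `version` counter, and the all-to-all `exchange_round` helper are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Versioned:
    """A small piece of information plus a hypothetical version counter."""
    version: int
    value: bytes


def merge(local: Versioned, received: Versioned) -> Versioned:
    """Replace the local copy only if the received one is newer.

    The new value never depends on the old one, so applying the same
    update twice gives the same result as applying it once.
    """
    return received if received.version > local.version else local


def exchange_round(states: List[Versioned]) -> List[Versioned]:
    """Simulate one round in which every process sends its copy to all others.

    Each process ends up with the highest-versioned copy it has seen.
    """
    newest = states[0]
    for state in states[1:]:
        newest = merge(newest, state)
    return [merge(state, newest) for state in states]
```

Because `merge` only ever replaces an older copy with a newer one, duplicated or reordered messages are harmless in this sketch. What it does not handle is two different values carrying the same version; a real system would need a rule for that case.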



2014 State Of Database Tech: Think Retro

Joe Masters Emison for InformationWeek:

Today’s database landscape isn’t just static. It’s positively retro. Remember 2004? Facebook had just launched, the iPad wasn’t even a twinkle in Steve Jobs’ eye, and Gartner’s database market share report put IBM (34.1%), Oracle (33.7%), and Microsoft (20%) in the top spots. In our survey, Microsoft, Oracle, and IBM still hold the top spots; we do add MySQL, but that’s about it for innovation. […]

And those relational databases from Microsoft, Oracle, and IBM? They’re essentially just updated versions of the companies’ 2004 offerings.

You’ll see these numbers in many surveys. But there are a couple of things to keep in mind while reading them:

  1. the enterprise world is well-known to be a late adopter. A very late adopter actually.
  2. many of these databases are subscription-based, so customers are locked in on at least a yearly basis
  3. many of these databases have been acquired together with hardware and consultancy/support. Another type of lock-in.
  4. none of these databases is showing the growth in demand, jobs, and revenue that the top NoSQL databases have seen over the last 12-18 months.

When you’ve already bought a house, it’s quite difficult to go out looking for a new one. But there’s no good reason for you not to look and get the best appliances and furniture for your house.


April 3 Webinar: The BlueKai Playbook for Scaling to 10 Trillion Transactions a Month [sponsor]

myNoSQL’s supporter Aerospike is getting ready for a new case study webinar:

As the industry’s largest online data exchange, BlueKai knows a thing or two about pushing the limits of scale. Find out how they are processing up to 10 trillion transactions per month from Vice President of Data Delivery, Ted Wallace.

Register today.


NoSQL vendor Basho restaffs barren executive ranks

Joab Jackson (Computerworld) quoting the newly appointed CEO of Basho, Adam Wray:

The previous leadership was intelligent and brought a lot of skills to the table, but have not run companies of this size and with this meteoric growth.


Quite frankly, some people were just getting tired, and needed a change in venue.

Many founders and engineers talk about their companies and products as their babies. Stepping away for the benefit of your babies, once you realize that there may be better ways to raise them, is a very difficult decision to make.

Maybe that’s the case with what’s happening at Basho.



End of an era at Basho

Jack Clark for The Register:

The Register has learned that CTO Justin Sheehy will soon be off to pastures unknown; chief architect Andy Gross today revealed he is leaving to take up a position at Twitter; and CEO Greg Collins left in January.

This marks the end of an era at Basho. Only the future, if there’s one, will tell if this was a good or bad era.



A Couchbase stack for under $1000

In this article we are going to look at how you can build an awesome cloud based solution with a lot of headroom and power for Couchbase for under $1000!

Getting 8 servers (2 reverse proxies, 2 app servers, 4 database nodes) for this money sounds like a sweet deal.
