RDBMS: All content tagged as RDBMS in NoSQL databases and polyglot persistence

Intro to NoSQL Databases: What's Wrong with RDBMSs?

  1. RDBMSs use a table-based normalization approach to data, and that’s a limited model. Certain data structures cannot be represented without tampering with the data, programs, or both.
  2. They allow versioning or activities like: Create, Read, Update and Delete. For databases, updates should never be allowed, because they destroy information. Rather, when data changes, the database should just add another record and note duly the previous value for that record.
  3. Performance falls off as RDBMSs normalize data. The reason: Normalization requires more tables, table joins, keys and indexes and thus more internal database operations to implement queries. Pretty soon, the database starts to grow into the terabytes, and that’s when things slow down.

Reality check: 1 is fine; 3 is about joins, not about keys, indexes, and tables; but 2 is really puzzling.
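To make point 2 a bit more concrete, here is a minimal sketch of what "add another record instead of updating" could look like. It uses Python's built-in sqlite3 module as a stand-in for any relational database; the table name `customer_versions` and the helper functions are hypothetical and only serve the illustration.

```python
import sqlite3


def open_store():
    """Create an in-memory table where every change is a new row (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_versions ("
        " customer_id INTEGER NOT NULL,"
        " version INTEGER NOT NULL,"
        " email TEXT NOT NULL,"
        " PRIMARY KEY (customer_id, version))"
    )
    return conn


def record_change(conn, customer_id, email):
    """Append the next version for this customer; earlier rows are never updated."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM customer_versions"
        " WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO customer_versions (customer_id, version, email)"
        " VALUES (?, ?, ?)",
        (customer_id, latest + 1, email),
    )
    return latest + 1


def current_email(conn, customer_id):
    """Return the most recent value, or None if the customer is unknown."""
    row = conn.execute(
        "SELECT email FROM customer_versions WHERE customer_id = ?"
        " ORDER BY version DESC LIMIT 1",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None


def history(conn, customer_id):
    """Return every value the record has ever had, oldest first."""
    rows = conn.execute(
        "SELECT email FROM customer_versions WHERE customer_id = ?"
        " ORDER BY version",
        (customer_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Note that nothing here needs a non-relational database: an append-only table gives you the "previous value" history, which is part of why point 2 is puzzling as a criticism of RDBMSs.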

Original title and link: Intro to NoSQL Databases: What’s Wrong with RDBMSs? (NoSQL databases © myNoSQL)


The Disruptive Value of Distributed Key-Value Stores

Martin Schneider (Basho):

Organizations with specific needs best met by a platform like Riak could save a company:

  • Millions of dollars in Oracle license/maintenance
  • Hundreds of thousands a year in BI system license/maintenance
  • Up to hundreds of thousands in sys-admin salary/overhead

This sounds correct in theory. But the last couple of Oracle databases I’ve seen were:

  1. serving multiple applications
  2. sharing data between applications
  3. used for generating tens/hundreds of reports

So, the online storage/OLTP cost equation to beat is:

licenses + operational costs + “data integration costs” + ETL + reporting << licenses + operational costs



Yet NoSQL Dogma

Given all this I did some benchmarks and as expected the NOSQL community was hurt […]

He is right about my comments not being clear. Let me try to clarify.

  1. If all you need is MySQL minus the relational model, using HandlerSocket for key-value access, that’s OK.

  2. If you are ready to give up the relational model and turn your database into a key-value store, then doors are open to try using other data models: document database, key-value stores, etc.

    The fact that MySQL can be used as a key-value store is a good thing, but not looking around for alternatives is dogmatic. Other storage solutions might actually offer better alternatives (performance, scalability, data model richness, development agility, etc.).

  3. If you need to build a distributed system, there are solutions out there that can maintain either ACID characteristics (like VoltDB) or that can go fully scalable (like Cassandra, HBase, Riak, etc.).

So, what is my point? If your relational database works for you that’s perfect. If it doesn’t, you have alternatives. Just look around and pick something that solves the problem. We are in a polyglot persistence world. We have options. We provide solutions and move along.
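For readers wondering what "a relational database minus the relational model" amounts to, here is a rough sketch of a two-column table used as a key-value store. It uses Python's built-in sqlite3 module as a stand-in; it is not MySQL and it does not use HandlerSocket, and the class name `TableKV` and the table layout are assumptions made purely for illustration.

```python
import sqlite3


class TableKV:
    """A key-value store backed by a single two-column table (hypothetical layout)."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def put(self, key, value):
        # Overwrite any existing value: no joins, no foreign keys, no schema per entity.
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )

    def get(self, key, default=None):
        # Every read is a primary-key lookup.
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def delete(self, key):
        self.conn.execute("DELETE FROM kv WHERE k = ?", (key,))
```

Once the access pattern is reduced to `put`, `get`, and `delete` like this, swapping the backing store for a dedicated key-value or document database becomes a realistic option to evaluate, which is the point of item 2 above.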



Cloud Foundry, NoSQL Databases, and Polyglot Persistence

VMware’s Cloud Foundry has the potential to become the preferred PaaS solution. It bundles together a set of services that took other PaaS providers (Google App Engine, Microsoft Azure) years to offer. And it seems that Cloud Foundry has much less vendor lock-in, or none at all[1].

From a storage perspective, Cloud Foundry encourages polyglot persistence right from the start, offering access to a relational database (MySQL), a super-fast smart key-value store (Redis), and a popular document database (MongoDB). The only bit missing is a graph database[2].

I think the first graph database to get there will see an immediate bump in its adoption.

  1. These comments are based on what I’ve read about VMware Cloud Foundry, as I haven’t received my invitation yet.

  2. I don’t think wide-column databases (Cassandra, HBase) are a fit for PaaS.


Scaling an RDBMS in 6 Steps

From Gavin Heavyside’s slides:

  • Launch successful service
  • Read saturation: add caching
  • Write saturation: add hardware
  • Queries slow down: denormalize
  • Reads still too slow: prematerialise common queries, stop joining
  • Writes too slow: drop secondary indexes and triggers
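The "prematerialise common queries, stop joining" step can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module; the `users`, `orders`, and `user_totals` tables and the sample rows are hypothetical and do not come from the slides.

```python
import sqlite3


def build_db():
    """Create a small normalized schema with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL,
            total INTEGER NOT NULL
        );
        INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO orders VALUES (1, 1, 10), (2, 1, 15), (3, 2, 7);
        """
    )
    return conn


def totals_with_join(conn):
    """The normalized way: join and aggregate on every read."""
    return conn.execute(
        "SELECT u.name, SUM(o.total) FROM users u"
        " JOIN orders o ON o.user_id = u.id"
        " GROUP BY u.id, u.name ORDER BY u.name"
    ).fetchall()


def refresh_user_totals(conn):
    """Pay the join cost once and store the result in a denormalized table."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS user_totals;
        CREATE TABLE user_totals AS
            SELECT u.name AS name, SUM(o.total) AS total
            FROM users u JOIN orders o ON o.user_id = u.id
            GROUP BY u.id, u.name;
        """
    )


def totals_prematerialized(conn):
    """The read path after prematerialising: a plain scan, no join."""
    return conn.execute(
        "SELECT name, total FROM user_totals ORDER BY name"
    ).fetchall()
```

The trade-off is visible right away: reads get cheaper, but `user_totals` is stale until the next refresh, so the application now owns a consistency problem the database used to handle.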


RDBMS Shortcomings and Band-Aids or What Led to NoSQL

James Phillips’[1] version of what led to NoSQL:

RDBMS: shortcomings and band-aids

  • It remains a centralized, “scale-up” technology; runs on complex, proprietary, expensive servers; and handling more users requires getting bigger (and even more expensive) servers (for increased CPU, memory and I/O capacity).
  • Running RDBMS technology in an otherwise distributed architecture highlights its lack of flexibility for “rightsizing” the database in real time to fit the needs and usage patterns of the application. (The Web logic layer scales out; the relational database, well, can’t).
  • The rigidity of the database schema — the fact that changing the schema once data is inserted is A Big Deal — makes it very difficult to quickly change application behavior, especially if it involves changes to data formats and content.

Recognizing these shortcomings of RDBMSs for modern interactive software applications, developers and practitioners have come up with some workarounds — for example, sharding, denormalizing, and distributed caching — which, while useful to a limited degree, are really just Band-Aids that ease symptoms, but don’t fight the disease.
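As an illustration of the sharding workaround mentioned above, here is a minimal sketch of application-level hash sharding. Plain Python dicts stand in for separate database servers, and the `ShardedStore` class is hypothetical; it is not taken from the Couchbase paper.

```python
import hashlib


class ShardedStore:
    """Route each key to one of N shards by hashing the key (illustrative only)."""

    def __init__(self, shard_count):
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        # Each dict stands in for an independent database server.
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, key):
        # Use a stable hash so the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key, default=None):
        return self.shards[self.shard_for(key)].get(key, default)
```

With modulo routing like this, changing the number of shards remaps most keys, and any query spanning several keys has to be stitched together by the application. That is roughly why the quote treats sharding as a Band-Aid.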

The Couchbase “Why NoSQL” paper provides more details on this topic.

  1. James Phillips: Senior Vice President of Products at Couchbase, @JamesMPhillips  



When NoSQL makes better sense than MySQL

Tinniam V Ganesh:

However when there is a need to scale the application to be capable of handling millions of transactions the NoSQL model works better. There are several examples of such databases – the more reputed are Google’s BigTable, HBase, Amazon’s Dynamo, CouchDB & MongoDB.

BigTable is available only to Google’s employees, Amazon’s Dynamo only to Amazon employees, CouchDB is not scalable per se, and MongoDB’s scaling architecture is young and complicated. How about this list instead: Cassandra, HBase, Hypertable, Riak, Project Voldemort, Amazon SimpleDB, Hibari(?), MongoDB?



Why do we need so many different databases?

My ideal database would borrow from RDBMS (like SQL Server), Document databases (like MongoDB), Graph Databases and Semantic Web Triple Stores; it would be the perfect hybrid of all of these and it would configure itself to be as efficient as possible answering queries.

That’s exactly the definition of polyglot persistence.

Every application could benefit from using different data models: the data ingestion module using a document store, a reporting module using relational data, another module using a graph model, etc.

But if all these models existed in the same tool, it would be a mammoth. It would be a tool good for everything and best at none. Doesn’t that sound familiar in a way? Too heavy, too complicated, not agile.

Think of programming languages and multi-paradigms: object-oriented, functional, logic, etc. I’d love to be able to use any of them. But having a single language supporting all of these, I don’t know.

What I’d like is to have the option. And good, or even better, standardized inter-communication. Put differently, what I want is neither a monolith nor a highly heterogeneous environment.



HP CEO about Relational Databases

James Governor[1] reporting from the HP CEO Leo Apotheker keynote at the HP Analyst Summit:

“traditional relational databases are becoming less and less relevant to the future stack”

Even though HP acquired the real-time analytics platform Vertica, I haven’t heard of HP in the NoSQL space, so my first thought was that this is just the usual attack on competitors.

But it could also express HP’s interest in getting into the NoSQL market. The speculation game about HP’s acquisitions is open.

  1. James Governor: Co-founder of RedMonk, @monkchips  


The NoSQL Gene in SQL Azure Federations

[SQL Azure] Federations bring great benefits of NoSQL model into SQL Azure where it is needed most. I have a special love for RDBMSs after having worked on 2, Informix and SQL Server, but I also have a great appreciation for NoSQL qualities after having worked on challenging web platforms. These web platforms need flexible app models with elasticity to handle unpredictable capacity requirements and needed the ability to deliver great computational capacity to handle peaks and at the same time deliver that with great economics. NoSQL does bring advantages in this space and I’d argue SQL Azure is inheriting some of these properties of NoSQL through federations.

The way I read it: “we’ve scaled SQL Server as much as we could. Now we need to look at how other scalable distributed systems are built to get us over the dead ends we’ve hit”.



Preliminary Comparison of Database.com and SQL Azure Features and Capabilities

Extensive comparison of the upcoming Database.com and Microsoft’s SQL Azure: Salesforce.com will unbundle its underlying relational database engine from Force.com when the firm releases Database.com’s commercial version in 2011. In the meantime, developers can test-drive Database.com with a free developer account, which includes a database having:

  • Three enterprise user accounts
  • 100,000 rows of storage per month
  • 150,000 transactions per month

According to the article, Database.com will support ACID transactions (Apex code), triggers and stored procedures (Apex code), relationships, a query language, and full-text search. Looks like a relational database in the cloud, but it doesn’t necessarily need to be one underneath.
