nosql: All content tagged as nosql in NoSQL databases and polyglot persistence

Best NoSQL April’s Fool

I know a few people who avoid the Internet completely on April Fools’ Day. After being tricked every year by my dad, I’m very careful about what I post on that day. This year has been easy on me, but that doesn’t mean there weren’t a couple of good ones.

My favorites:

Original title and link: Best NoSQL April’s Fool (NoSQL database©myNoSQL)


Cage Match: MySQL vs NoSQL vs Postgres

A post by Brian Aker about the state of MySQL, Postgres and NoSQL databases.

I had a couple of comments and these evolved into a long rant.

MySQL became less interesting once it was acquired […]

I’ve never been sure which metric measures how interesting a product is. Contrary to some suggestions I’ve read, I haven’t seen stories of people moving away from MySQL because Oracle acquired it. The exceptions are Fedora and openSUSE replacing MySQL with MariaDB, and that was due to very specific issues (no security information, no access to regression tests).

the number of Postgres deployments is greater then what all of the NoSQL market combined adds up to

Comparing 15 years of PostgreSQL with 3 years of NoSQL isn’t going to give meaningful results (for a similarly unbalanced comparison, try Oracle vs PostgreSQL). I’m not aware of any database that captured a significant market share in the first 3 years of its existence. Except MySQL. Not Postgres.

Would a document model really matter if schemas could be altered online?

Yes, it would definitely remain relevant. Schema flexibility is not only about updating it, but also about the types allowed. PostgreSQL has indeed added support for arrays and JSON. I see this as a confirmation of what’s happening in the NoSQL space and also about the future of storage engines.

no new language has emerged from the NoSQL market that has any size-able adoption

MongoDB’s query language and aggregation framework are used by a lot of people. It’s probably not the ideal query language and it comes in two different flavors, but it’s there and it will most probably evolve. I’m biased, but I could also point to RethinkDB’s data manipulation language as an example of something that is probably on par with SQL, without SQL’s hidden corner cases. Indeed, none of these can come close to the adoption SQL has acquired in its 30 years of existence.
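To illustrate the “two flavors” mentioned above, here is a minimal sketch of what each looks like when written as plain Python dictionaries. The field names (`status`, `age`, `city`) are hypothetical; with the pymongo driver, the first value would typically be passed to `collection.find()` and the second to `collection.aggregate()`.

```python
# Flavor 1: a filter document, the kind of value passed to find().
# It describes which documents match; field names here are hypothetical.
find_filter = {"status": "active", "age": {"$gt": 30}}

# Flavor 2: an aggregation pipeline, a list of stages applied in order.
# Each stage is a single-key document naming the operation to perform.
pipeline = [
    {"$match": find_filter},                             # same filter syntax
    {"$group": {"_id": "$city", "count": {"$sum": 1}}},  # count per city
    {"$sort": {"count": -1}},                            # largest groups first
]
```

The filter syntax is shared, but grouping and sorting are expressed as pipeline stages rather than as clauses of one statement, which is the split the paragraph above refers to.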

The bottom line is that I expect bridges to be built between relational and NoSQL databases, with each side adopting the features that are useful to its users. I also expect the “relational databases are crap” vs. “NoSQL databases are crap” debate to slowly go away as people realize that the data space is not a zero-sum game. Vendors will be the last to give up this fight, but customers have a lot of power in making this happen.

Original title and link: Cage Match: MySQL vs NoSQL vs Postgres (NoSQL database©myNoSQL)

via: http://blog.krow.net/2013/03/mysql-vs-nosql-vs-postgres-vs-sql.html


Traditional, NoSQL and NewSQL Are All Broken. All Data in Memory

Stacey Schneider for VMware:

Over the past few years, memory has gotten cheap and is easily commoditized in the cloud. So moving your data strategy to put it all in-memory just plain makes sense. It eliminates an extra hop to read and write data from disk, making it inherently faster and the performance more consistent. It also manages to simplify the internal optimization algorithms and reduce the number of instructions to the CPU making better use of the hardware.

This is the “conclusion” after “establishing” in the post that:

  1. traditional databases are already broken because of the fixed schemas and data being persisted on disk
  2. NoSQL databases are also broken because even if they have flexible schemas, data is still persisted on disk and “replication takes time to do all the read and writes”
  3. NewSQL are also broken because “the way the databases handles the data distribution makes it so there NewSQL databases do not scale linearly”

All this FUD just to promote GemFire and SQLFire? I really thought VMware was a serious company.

Original title and link: Traditional, NoSQL and NewSQL Are All Broken. All Data in Memory (NoSQL database©myNoSQL)

via: http://blogs.vmware.com/vfabric/2013/03/why-every-database-must-be-broken-soon.html


One Database to Rule Them All?

Curt Monash took it upon himself to write about why a data store independent of consistency models, upfront data modeling and access algorithms is almost impossible:

To date, nobody has ever discovered a data layout that is efficient for all usage patterns.

He’s reached a similar conclusion to what I wrote in my link post. Here’s mine:

[…] a database featuring an ubiquitous interface independent of consistency models, upfront data modeling, and access algorithms is never going to be efficient. Actually, I’m not even sure it would make sense being built

Here’s Curt Monash’s:

So what would happen if somebody tried to bundle all conceivable functionality into a single DBMS, with a plan to optimize the layout of any particular part of the database as appropriate? I think the outcome would be tears — for the development effort would be huge, while the benefits would be scanty. The most optimistic cost estimates could run in the 100s of millions of dollars, with more realistic ones adding a further order of magnitude. But no matter what the investment, the architects would be on the horns of nasty dilemma

Definitely more impactful.

Original title and link: One Database to Rule Them All? (NoSQL database©myNoSQL)

via: http://www.dbms2.com/2013/02/21/one-database-to-rule-them-all/


A Data Store Independent of Consistency Models, Upfront Data Modeling and Access Algorithms

Tina Groves[1] in “Where Does Hadoop Fit in a Business Intelligence Data Strategy?”:

For example, the decision to move and transform operational data to an operational data store (ODS), to an enterprise data warehouses (EDW) or to some variation of OLAP is often made to improve performance or enhance broad consumability by business people, particularly for interactive analysis. Business rules are needed to interpret data and to enable BI capabilities such as drill up/drill down. The more business rules built into the data stores, the less modelling effort needed between the curated data and the BI deliverable.

That’s why Chirag Mehta’s ideal database featuring “an ubiquitous interface independent of consistency models, upfront data modeling, and access algorithms” is never going to be efficient. Actually, I’m not even sure it would make sense being built.


  1. Tina Groves: Product Strategist, IBM Business Intelligence 

Original title and link: A Data Store Independent of Consistency Models, Upfront Data Modeling and Access Algorithms (NoSQL database©myNoSQL)

via: http://www.ibmbigdatahub.com/blog/where-does-hadoop-fit-business-intelligence-data-strategy


NoSQL or NewSQL: The Ideal Database

Talking about ideal database solutions, Chirag Mehta writes in “A Journey From SQL to NoSQL to NewSQL“:

“Design a data store that has ubiquitous interface for the application developers and is independent of consistency models, upfront data modeling (schema), and access algorithms. As a developer you start storing, accessing, and manipulating the information treating everything underneath as a service. As a data store provider you would gather upstream application and content metadata to configure, optimize, and localize your data store to provide ubiquitous experience to the developers. As an ecosystem partner you would plug-in your hot-swappable modules into the data stores that are designed to meet the specific data access and optimization needs of the applications.”

Original title and link: NoSQL or NewSQL: The Ideal Database (NoSQL database©myNoSQL)

via: http://cloudcomputing.blogspot.com/2013/01/a-journey-from-sql-to-nosql-to-newsql.html


The Evolving Database Landscape

Matthew Aslett of the 451 Group published an updated version of the database landscape graphic that is included in the group’s reports:

[Image: DB-landscape]

Very similar, but more complete than The Database World in a Venn Diagram from Infochimps.

Original title and link: The Evolving Database Landscape (NoSQL database©myNoSQL)

via: http://blogs.the451group.com/information_management/2012/11/02/updated-database-landscape-graphic/


The Database World in a Venn Diagram

Infochimps put together a comprehensive Venn diagram of the database world in the TechCrunch article Big Data Right Now: Five Trendy Open Source Technologies

[Image: The Database World]

Original title and link: The Database World in a Venn Diagram (NoSQL database©myNoSQL)


Life of Data at Facebook

Nice screenshot by the TechCrunch people of the slide talking about the data lifecycle at Facebook:

[Image: life-of-data-at-facebook. Credit: TechCrunch]

Based on this you’ll now have a better picture of how Facebook data ingestion numbers correlate to their architecture.

Original title and link: Life of Data at Facebook (NoSQL database©myNoSQL)


A Short History of NoSQL, SQL, NoSQL

This post was written by Konstantin Osipov, one of the authors of the Tarantool key-value store, and was originally posted on the NoSQL Google group.

Way back in the 1960s databases didn’t separate data representation and data access.

To navigate in an index, a database user had to know the physical structure of the index.

Obvious deficiencies of this approach led to the separation of the data model from data representation. The relational model is one way to do it, and still the most popular.

One of the most well known deficiencies of a relational model is the so-called object-relational impedance mismatch: there is more than one way to map objects to relations, and none of them fits all access patterns well.

It also has a number of advantages: simplicity, ease of analytical processing, and, let’s not forget, performance: by normalizing data, a user is forced to tell the DBMS more about data constraints, distribution, and future access patterns.

This makes building efficient and to-the-point data representation structures easier.
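To make the object-relational mismatch mentioned above concrete, here is a minimal sketch using Python’s built-in `sqlite3` module. The `Article` class, the table names, and the sample data are all hypothetical. The same object is stored two ways: normalized into two tables, and embedded in a single row with its tags serialized as JSON text. Each mapping favors a different access pattern.

```python
import json
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Article:  # hypothetical application object
    id: int
    title: str
    tags: list = field(default_factory=list)


def save_normalized(db, a):
    # Mapping 1: one row per article plus one row per tag.
    db.execute("INSERT INTO article (id, title) VALUES (?, ?)", (a.id, a.title))
    db.executemany("INSERT INTO article_tag (article_id, tag) VALUES (?, ?)",
                   [(a.id, t) for t in a.tags])


def save_embedded(db, a):
    # Mapping 2: the whole object in one row, tags serialized as JSON text.
    db.execute("INSERT INTO article_doc (id, title, tags) VALUES (?, ?, ?)",
               (a.id, a.title, json.dumps(a.tags)))


def titles_by_tag_normalized(db, tag):
    # The DBMS knows about tags, so it can join (and index) on them.
    rows = db.execute(
        "SELECT a.title FROM article a "
        "JOIN article_tag t ON t.article_id = a.id "
        "WHERE t.tag = ? ORDER BY a.id", (tag,))
    return [r[0] for r in rows]


def titles_by_tag_embedded(db, tag):
    # Here the tags are opaque text, so every row is read and decoded
    # in the application.
    rows = db.execute("SELECT title, tags FROM article_doc ORDER BY id")
    return [title for title, tags in rows if tag in json.loads(tags)]


def demo():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE article_tag (article_id INTEGER, tag TEXT);
        CREATE TABLE article_doc (id INTEGER PRIMARY KEY, title TEXT, tags TEXT);
    """)
    for a in [Article(1, "Intro to HBase", ["hadoop", "bigtable"]),
              Article(2, "Redis tips", ["key-value"]),
              Article(3, "Hive basics", ["hadoop"])]:
        save_normalized(db, a)
        save_embedded(db, a)
    return (titles_by_tag_normalized(db, "hadoop"),
            titles_by_tag_embedded(db, "hadoop"))
```

Both lookups return the same titles, but the normalized mapping lets the database do the filtering, while the embedded mapping loads a whole object in a single read and pays for it with a full scan on tag lookups.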

Unfortunately, past generations of database management systems did not address one of the main architectural drawbacks that plagues the relational model: the rigidity of schema change. Very few mainstream DBMSs allow changing the structure of a relational database quickly, without downtime or a significant performance penalty. This is a drawback not of the relational model itself, but of its implementations.

It should also be kept in mind that in many cases a relational model is an overkill, and a simple key to value mapping is sufficient.

And of course no single model can fit all needs (e.g. graph databases are built around the notion of nodes and edges, yet good luck trying to quickly calculate a CUBE on a bunch of nodes in a graph database).
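For readers unfamiliar with CUBE: it aggregates a measure over every combination of the grouping dimensions, which is natural over flat rows and awkward over a graph of nodes and edges. A minimal sketch in Python, with hypothetical sample data and `None` standing in for “all values” of a rolled-up dimension:

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims`, similar to SQL's GROUP BY CUBE."""
    totals = defaultdict(int)
    for row in rows:
        for size in range(len(dims) + 1):
            for subset in combinations(dims, size):
                # None marks a dimension that has been rolled up.
                key = tuple(row[d] if d in subset else None for d in dims)
                totals[key] += row[measure]
    return dict(totals)


# Hypothetical sample data.
sales = [
    {"region": "EU", "product": "db", "amount": 10},
    {"region": "EU", "product": "cache", "amount": 5},
    {"region": "US", "product": "db", "amount": 7},
]
result = cube(sales, ["region", "product"], "amount")
```

Here `result[("EU", None)]` holds the total for the EU across all products, and `result[(None, None)]` holds the grand total. With n dimensions, each row contributes to 2^n groups.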

Unfortunately, the world of NoSQL, when it comes to the data model, often simply takes us back to the 60s: there is minimal abstraction of data access from data representation, and once a certain representation has been chosen, there is no way to change it without rewriting your application (e.g. to fit the new performance profile).

Scalability is an answer, but a silly one: throwing more hardware at a problem is not always economical.

Original title and link: A Short History of NoSQL, SQL, NoSQL (NoSQL database©myNoSQL)


Michael Stonebraker Says in Defense of NewSQL

The reason that this is becoming a hot-button issue is because IT organizations have invested billions of dollars in investments in SQL. Adding new data management frameworks such as Hadoop will add considerable expense in terms of finding people with the skills needed to manage these platforms. Stonebraker isn’t necessarily against Hadoop; he’s just pointing out that there is no one SQL database engine that fits all requirements and that before IT organizations adopt a NoSQL approach, they should consider other SQL-compatible approaches to solving the same problem.

Translated: What do these NoSQL kids know? My products are always the best. So instead of paying them, why not continue paying me?

Original title and link: Michael Stonebraker Says in Defense of NewSQL (NoSQL database©myNoSQL)

via: http://www.itbusinessedge.com/cm/blogs/vizard/in-defense-of-new-sql/?cs=49255


The Oracle NoSQL Database and Big Data Appliance

There’s been a lot of speculation about the announcements coming from Oracle’s OpenWorld event. A first part was revealed during the keynote in the form of an in-memory analytics appliance called Exalytics [2]. But there’s talk about a Big Data Appliance and an Oracle NoSQL database.

Here are my predictions[1]:

  1. Oracle has become very aggressive in selling products that combine hardware, software, and services. So they’ll announce a Hadoop appliance integrated with an existing Oracle product. It could be either Oracle Exadata or even the newly announced Exalytics.

    This appliance will place Oracle in competition with all other Hadoop appliance sellers: EMC, NetApp, IBM. Also these days most of the analytics databases try to integrate with Hadoop.

  2. Oracle already has a couple of non-relational solutions in its portfolio: Berkeley DB, TimesTen, Coherence. And they’ve already started to test the NoSQL market by announcing the MySQL and MySQL Cluster NoSQL hybrid systems.

    I don’t expect the Oracle NoSQL database to be a new product, just a rebranding or repackaging of one of those mentioned above. Probably TimesTen.

  3. Oracle will invest more into integrating its line of products with Hadoop. Having both a Hadoop and an in-memory analytics appliance will make them very competitive in this space.

  4. Oracle will extend the support for NoSQLish interfaces (memcached) to its other database products.

What are your predictions?


  1. or speculations  

  2. I’m currently gathering more details about Exalytics.  

Original title and link: The Oracle NoSQL Database and Big Data Appliance (NoSQL database©myNoSQL)