theory: All content tagged as theory in NoSQL databases and polyglot persistence

Google Research: Moore’s Law series

This series of posts from Google Research is not strictly about NoSQL, or even about data, but it is a very interesting read about Moore’s Law, the kind of research happening in this space, and what we need to do to keep seeing the same rate of advance over the next 10-15 years:

This series quotes major sources about Moore’s Law and explores how they believe Moore’s Law will likely continue over the course of the next several years. We will also explore if there are fields other than digital electronics that either have an emerging Moore’s Law situation, or promises for such a Law that would drive their future performance.

  1. Brief history of Moore’s Law and current state
  2. More Moore and More than Moore
  3. Possible extrapolations over the next 15 years and impact
  4. Moore’s Law in other domains

A teaser:

As we look at the years 2020–2025, we can see that the physical dimensions of CMOS manufacture are expected to be crossing below the 10 nanometer threshold. It is expected that as dimensions approach the 5–7 nanometer range it will be difficult to operate any transistor structure that is utilizing the metal-oxide semiconductor (MOS) physics as the basic principle of operation. Of course, we expect that new devices, like the very promising tunnel transistors, will allow a smooth transition from traditional CMOS to this new class of devices to reach these new levels of miniaturization. However, it is becoming clear that fundamental geometrical limits will be reached in the above timeframe. By fully utilizing the vertical dimension, it will be possible to stack layers of transistors on top of each other, and this 3D approach will continue to increase the number of components per square millimeter even when horizontal physical dimensions will no longer be amenable to any further reduction. It seems important, then, that we ask ourselves a fundamental question: “How will we be able to increase the computation and memory capacity when the device physical limits will be reached?” It becomes necessary to re-examine how we can get more information in a finite amount of space.

Original title and link: Google Research: Moore’s Law series (NoSQL database©myNoSQL)

Carnegie Mellon’s Introduction to machine learning course

Carnegie Mellon has made its “Introduction to machine learning” course available online. Free.

No excuses.

Daniel Gutierrez


What Is a Data Fabric Architecture?

Great educational post by Robert Hodges about what a data fabric is, the architecture of data fabrics, and the underlying high-level implementation details:

It must hide failures, maintenance, and schema upgrades on individual DBMS hosts. It must permit data to distribute across geographic regions. It must deliver and accept transactions from NoSQL, data warehouses, and commercial DBMS in real time. It must allow smooth technology upgrade and replacement. Finally, it must look as much like a single DBMS server to applications as possible.

I’ve learned a couple of things from it:

  1. the expectations and requirements for data-as-a-service systems, plus possible approaches to designing them
  2. the challenges, complexity, and trade-offs involved in scaling relational databases.



Test Driving Database Indexes

Myron Marston:

Database indexes are conceptually very simple, but in practice, I’ve found that it’s hard to predict when they’ll get used and what indexes a given table needs. On a project at work I came up with the idea to test-drive my database indexes, just like I test-drive the rest of my code. I’d like to share the approach I came up with.

A very interesting idea, at least for MySQL users.
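
To make the idea concrete, here is a minimal sketch of what test-driving an index could look like. It is not Myron Marston's implementation, which targets MySQL and is not reproduced here. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in so the example is self-contained, and it relies on SQLite's EXPLAIN QUERY PLAN naming the index it chooses. The users table, the idx_users_email index, and the helper functions are all hypothetical names chosen for illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail text is the last column of every row;
    # it mentions the index name when an index is used.
    return " | ".join(row[-1] for row in rows)


def uses_index(conn, sql, index_name, params=()):
    """True if the planner reports using the named index for this query."""
    return index_name in query_plan(conn, sql, params)


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT name FROM users WHERE email = ?"
args = ("user42@example.com",)

# "Red" step: the index does not exist yet, so the check fails.
before = uses_index(conn, lookup, "idx_users_email", args)

# "Green" step: add the index and check that the planner picks it up.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = uses_index(conn, lookup, "idx_users_email", args)

print("index used before:", before)
print("index used after:", after)
```

In a real test suite the two checks would be assertions in separate tests. On MySQL, the analogous check would inspect the key column of the EXPLAIN output rather than SQLite's plan text.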



New Fibre Transfer Speed Record

Researchers have set a new record for the rate of data transfer using a single laser: 26 terabits per second.

The only question now is how soon we will see these speeds live.



No Relation: The Mixed Blessings of Non-Relational Databases

A paper by Ian Thomas Varley, M.S.E. covering the following aspects of non-relational databases:

  • use cases
  • pros and cons
  • design strategies

The paper in PDF format can be downloaded from ☞ here

The Confused World of "NoSQL"

I believe it would be beneficial to separate these use cases and treat them differently (e.g. call one NoSQL and the other DDS for Distributed Data Store).

I’d argue that we should first understand these use cases.


Query Processing for NOSQL DB

Nevertheless, compared to the more mature RDBMS, NoSQL has some fundamental limitations that we need to be aware of.