Follow the Data

In the same series as “follow the money” and “cherchez la femme”, now, via Jim Harris[1], follow the data:

Data is often seen as just a by-product of business and technical processes, but a common root cause of poor data quality is this lack of awareness of the end-to-end process of how the organization is using its data to support its business activities. […] As a result, when an error occurs, it manifests itself in a downstream application, and it takes a long time to figure out where the error occurred and how it was related to the negative impacts.

Can you imagine how complicated this could be with an architecture like Digg’s? The only alternative to documenting the data flow would be to design everything around data islands: self-sufficient data stores that rarely, if ever, interact with each other. But that would mean losing the whole value of connecting the dots.

  1. I like the name of Jim’s blog: “Obsessive-Compulsive Data Quality”  

Original title and link: Follow the Data (NoSQL databases © myNoSQL)