

Statistical Advances: The Maximal Information Coefficient, a New Method to Uncover Hidden Data Relationships

Yakir Reshef (main researcher):

“If you have a data set with 22 million relationships, the 500 relationships in there that you care about are effectively invisible to a human.”

The statistical method that Reshef and his colleagues have devised aims to crack those problems. It can spot many superimposed correlations between variables and measure exactly how tight each relationship is, on the basis of a quantity that the team calls the maximal information coefficient (MIC). The MIC is calculated by plotting data on a graph and looking for all ways of dividing up the graph into blocks or grids that capture the largest possible number of data points. MIC can then be deduced from the grids that do the best job.
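To make the grid idea above concrete, here is a rough Python sketch. It is not the algorithm from the paper: the published method searches for the best placement of the grid lines, while this version only tries equal-frequency grids of different sizes and keeps the best normalized mutual information. The function name `approx_mic`, the helper names, and the `exponent=0.6` cap on the number of grid cells are assumptions made for illustration.

```python
import math
from collections import Counter


def _equal_frequency_bins(values, k):
    """Label each value with one of k bins holding roughly equal numbers of points.

    Simplification: tied values may land in different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(values)
    return labels


def _mutual_information(xb, yb):
    """Mutual information (in nats) between two sequences of bin labels."""
    n = len(xb)
    joint = Counter(zip(xb, yb))
    px = Counter(xb)
    py = Counter(yb)
    return sum(
        (c / n) * math.log((c * n) / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def approx_mic(xs, ys, exponent=0.6):
    """Crude MIC-style score in [0, 1] (hypothetical helper, not the paper's method).

    Tries every nx-by-ny equal-frequency grid with nx * ny <= n ** exponent,
    normalizes each grid's mutual information by log(min(nx, ny)),
    and returns the highest normalized value found.
    """
    n = len(xs)
    if n != len(ys) or n < 4:
        raise ValueError("need two sequences of equal length with at least 4 points")
    max_cells = max(4, int(n ** exponent))
    best = 0.0
    for nx in range(2, max_cells // 2 + 1):
        xb = _equal_frequency_bins(xs, nx)
        for ny in range(2, max_cells // nx + 1):
            yb = _equal_frequency_bins(ys, ny)
            score = _mutual_information(xb, yb) / math.log(min(nx, ny))
            best = max(best, score)
    return best
```

Calling `approx_mic(xs, ys)` on two equal-length lists of numbers returns a score near 1 for a clean monotone relationship and a much lower score for unrelated data. Because the grid lines are fixed rather than optimized, the scores will generally be lower than a real MIC implementation would report for non-monotone relationships.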

The original article, Detecting Novel Associations in Large Data Sets, was published in Science but is behind a paywall.


via: http://www.nature.com/news/tangled-relationships-unpicked-1.9660