Yakir Reshef (main researcher):
“If you have a data set with 22 million relationships, the 500 relationships in there that you care about are effectively invisible to a human.”
The statistical method that Reshef and his colleagues have devised aims to crack those problems. It can spot many superimposed correlations between variables and measure exactly how tight each relationship is, on the basis of a quantity that the team calls the maximal information coefficient (MIC). The MIC is calculated by plotting data on a graph and looking for all ways of dividing up the graph into blocks or grids that capture the largest possible number of data points. MIC can then be deduced from the grids that do the best job.
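To make the grid idea more concrete, here is a rough sketch in Python. It is not the authors' algorithm: instead of searching over all possible grid placements, it assumes simple equal-frequency bins on each axis, scores each grid by its mutual information, normalizes by the log of the smaller grid dimension, and keeps the best score. The function names (`equal_frequency_bins`, `mutual_information`, `approx_mic`) and the default grid-size budget of n^0.6 are illustrative assumptions, so the result should be read as a crude approximation of MIC at best.

```python
import math
from collections import Counter


def equal_frequency_bins(values, k):
    """Assign each value a bin index in 0..k-1 so bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


def mutual_information(xb, yb):
    """Mutual information (in bits) between two sequences of bin labels."""
    n = len(xb)
    joint = Counter(zip(xb, yb))
    px, py = Counter(xb), Counter(yb)
    mi = 0.0
    for (a, b), count in joint.items():
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


def approx_mic(x, y, exponent=0.6):
    """Best normalized mutual information over small equal-frequency grids.

    Assumption: grids are limited to kx * ky <= n ** exponent cells, and the
    bins are fixed quantile bins rather than optimized grid placements.
    Returns 0.0 when the sample is too small to allow even a 2 x 2 grid.
    """
    budget = len(x) ** exponent
    best = 0.0
    kx = 2
    while kx * 2 <= budget:
        xb = equal_frequency_bins(x, kx)
        ky = 2
        while kx * ky <= budget:
            mi = mutual_information(xb, equal_frequency_bins(y, ky))
            best = max(best, mi / math.log2(min(kx, ky)))
            ky += 1
        kx += 1
    return best


if __name__ == "__main__":
    xs = [i / 199 for i in range(200)]
    # A straight line and a parabola should both score close to 1,
    # even though the parabola has almost no linear correlation.
    print(approx_mic(xs, [2 * v + 1 for v in xs]))
    print(approx_mic(xs, [(v - 0.5) ** 2 for v in xs]))
```

The point of the normalization step is that a tight relationship of any shape gets a score near 1, which is what lets very different kinds of relationships be ranked on the same scale.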
The original article, "Detecting Novel Associations in Large Data Sets", was published in Science, but is behind a paywall.
Original title and link: Statistical Advances: The Maximal Information Coefficient a New Method to Uncover Hidden Data Relationships (©myNoSQL)