An academic from Drexel University decided to take a novel approach to answering this question. He opened up his HIV dataset to a competition,requiring participants to pick markers in a series of HIV genetic sequences that correlate with a change in viral load (a measure of the severity of infection). Participants would download the data, model it, make their predictions, those predictions were compared with actual outcomes and feedback was presented on a live leaderboard. Amazingly, within a week and a half, the best submission had already outdone the best methods in scientific literature. By the end of the competition, the state of the art had been outperformed by 10 per cent.
Another great example of the unreasonable effectiveness of data.
Original title and link: BigData Applied (NoSQL databases © myNoSQL)