data warehouse: All content tagged as data warehouse in NoSQL databases and polyglot persistence
Quick and Dirty (Incomplete) List of Interesting, Mostly Recent Data Warehousing and Big Data Papers by Peter Bailis
A friend asked me for a few pointers to interesting, mostly recent papers on data warehousing and “big data” database systems, with an eye towards real-world deployments. I figured I’d share the list. It’s biased and rather incomplete but maybe of interest to someone. While many are obvious choices (I’ve omitted several, like MapReduce), I think there are a few underappreciated gems.
Original title and link: Quick and Dirty (Incomplete) List of Interesting, Mostly Recent Data Warehousing and Big Data Papers by Peter Bailis ( ©myNoSQL)
Ron Bodkin, interviewed by Michael Floyd for InfoQ, describes the growing Hadoop addiction:
People are using Hadoop for a variety of analytics. Many of the first uses of Hadoop are complementing the traditional data warehouses I just mentioned, where the goal is to take some of the pressure off the data warehouse, start to be able to process less structured data more effectively, and to be able to do transformations and build summaries and aggregates, but not have to have all that data loaded to the data warehouse. But then the next thing that happens is, once people have started doing that level of processing, they realize there is a power in being able to ask questions of the data they never thought of before. They can store all the data in small samples and they can go back and have a powerful query engine, a cluster of commodity machines that lets them dig into that raw data and analyze it in new ways, ultimately leading to data science: being able to do machine learning, being able to discover patterns in data, and to keep improving and refining the data.
The interview is only 16 minutes long, and a full transcript is available.
Original title and link: Hadoop and NoSQL in a Big Data Environment with Ron Bodkin ( ©myNoSQL)
most drivers of change in BI and DW concern four Mega-Trends:
- new kinds of data
- increased analytic sophistication
I guess what’s new is the impact of the new kinds of data (here I’d probably include social data and sensor data), the continuously increasing size, and the new analytic approaches.
Original title and link: 6 Trends Driving Data Warehousing and Business Intelligence (NoSQL databases © myNoSQL)
Bradford on the future of data storage solutions, including NoSQL databases:
Q: Do you foresee any consolidation in the near term?
Bradford: I see actually a proliferation of the open source tools.
We’ve got a ton of key-value stores out there, like Cassandra, Voldemort. I have some feeling that people have very specific requirements that they are going to cook up and open source.
In the document database world I don’t see anything more than MongoDB, CouchDB, and a few others.
I do see consolidation happening in the commercial space, because there’s a lot of vendors out there doing very similar things, especially in the commercial data warehousing space.
And I see a ton of growth in areas like geo data — there’s no stack out there for geo data — and managing time series and other data like that.
Complete interview with O’Reilly’s David Sims below: