microsoft: All content tagged as microsoft in NoSQL databases and polyglot persistence
Sunday, 5 February 2012
Research in the MapReduce Space
Over the weekend I read two papers presenting products or research that improve or add new capabilities to the MapReduce data processing approach. The first comes from a team at Microsoft and describes TiMR, a time-oriented data processing system in MapReduce. The second, from a team at Google, presents Tenzing, a SQL implementation on the MapReduce framework. It’s great to learn that while the Hadoop community is eliminating some of the initial limitations and hardening the technical details of the platform, there are already ideas and systems out there that augment the capabilities of the MapReduce data processing model.
Original title and link: Research in the MapReduce Space (©myNoSQL)
Wednesday, 1 February 2012
Microsoft, Hadoop, and Open Source Contributions
Edd Dumbill:
Microsoft’s goals go beyond integrating Hadoop into Windows. It intends to contribute the adaptions it makes back to the Apache Hadoop project, so that anybody can run a purely open source Hadoop on Windows.
In the open source world contributions are measured in code or documentation or donations. Less so in interviews or PR announcements.
So far Microsoft doesn’t seem to know this game. But if its intentions are true, the community will help.
via: http://radar.oreilly.com/2012/01/microsoft-big-data.html
Wednesday, 25 January 2012
12 Hadoop Vendors to Watch in 2012
My list of the 8 most interesting companies for the future of Hadoop didn’t try to include everyone with a product that has the word Hadoop in it. But the list from InformationWeek does. To save you 15 clicks, here’s their list:
- Amazon Elastic MapReduce
- Cloudera
- Datameer
- EMC (with EMC Greenplum Unified Analytics Platform and EMC Data Computing Appliance)
- Hadapt
- Hortonworks
- IBM (InfoSphere BigInsights)
- Informatica (for HParser)
- Karmasphere
- MapR
- Microsoft
- Oracle
Tuesday, 10 January 2012
Partnerships in the Hadoop Market
Just a quick recap:
- Cloudera: Oracle, Dell, NetApp
- Hortonworks: Microsoft
- MapR: EMC (integration with Greenplum HD)
Amazon doesn’t partner with anyone for their Amazon Elastic MapReduce. And IBM is walking alone with the software-only InfoSphere BigInsights.
Wednesday, 4 January 2012
Claim Chowder: Microsoft’s Dryad Technology to Take on Google’s MapReduce
In Dec. 2010, Joab Jackson writes for IDG News Service: Microsoft’s Dryad technology to take on Google’s MapReduce. Just 11 months later, in Nov. 2011, Doug Henschen writes for the same IDG News Service: Microsoft Ditches Dryad, Focuses On Hadoop.
There is nothing wrong with Microsoft’s decision. The same cannot be said, though, about the titles and articles published by the IDG News Service network.
Wednesday, 21 December 2011
Hadoop: Amazon Elastic MapReduce and Microsoft Project Isotope
This is how things are rolling these days: Microsoft talks about offering Hadoop integration with Project Isotope in 2012, while Amazon announces the immediate availability of new beefed-up instances (Cluster Compute Eight Extra Large, cc2.8xlarge) and reduced prices for some of the existing instances.
Project Isotope Will Bring Together Hadoop Toolchain With Microsoft’s Data Products
There’s a series of events lately that makes me think Microsoft is nowhere near accepting defeat in the cloud services area. As regards Microsoft’s Project Isotope, things are much simpler than the ZDNet article makes them sound[1]: Microsoft is working on integrating Hadoop and its toolchain with their own products (SQL Server Analysis Services, PowerPivot).

A picture worth more than the 626 words.
[1] I bet the details of the integration are fascinating and far from simple, but the article is not focusing on those.
Monday, 19 December 2011
SQL Azure Federation... Aka Sharding
One of the exciting new features in the just-released SQL Azure Q4 2011 Service Release is SQL Azure Federation. In a sentence, SQL Azure Federation enables building elastic and scalable database tiers.
We all know the benefits of sharding, so why call it something different? NIH?
Wednesday, 30 November 2011
Data Is the New Currency. But Who’s Leading the Way?
In 2005, Tim O’Reilly said: “data is the next Intel Inside”. Today IDC’s Mario Morales (VP of semiconductor research) says data is the new currency. All’s good until you read the continuation:
And the companies that understand this are the ones already developing the analytics and infrastructure to extract that value—companies like IBM, HP, Intel, Microsoft, TI, Freescale and Oracle.
The article (nb: may require registration) continues by looking at what each of these companies is doing in the Big Data space, but devotes a large part to IBM Watson.
Going back to the question “who’s leading the Big Data way”, let’s take a quick look at the technology behind Watson. According to Jeopardy Goes to Hadoop and About Watson, Watson technology is based on Apache Hadoop, using an IBM language technology built on the Apache UIMA platform[1] and running Linux on IBM boxes.
To me it looks like open source is leading the advances in Big Data and these large organizations are just connecting the dots (as in packaging these technologies for enterprise environments and contributing missing pieces here and there)[2]. When did this happen before?
[1] Dmitriy Ryaboy taught me that UIMA came out of IBM in the first place and they’ve been critical in its development.
[2] Or they are very secretive about their internal initiatives and research.
Thursday, 16 June 2011
Oracle and IBM May Not Know Big Data, but Neither Does Ballmer
Specifically, for a data processing and analytics project to qualify as Big Data, it must encompass not just internal corporate data, but also third-party data that resides outside the firewall, according to Ballmer. He said IBM and Oracle limit their Big Data approaches to internal data, thus they are not in fact Big Data by his definition.
[…]
IBM, Oracle and now Microsoft are jockeying to position each of their approaches to Big Data as the industry standard, and Ballmer is clearly trying to steer the Big Data conversation towards Microsoft’s strengths and away from its weaknesses. That means talking up Microsoft’s ability to integrate third-party data with relatively large volumes of corporate data inside Microsoft’s SQL Server R2 Parallel Data Warehouse and away from its lack of petabyte-scale data processing power.
I guess there will be no end to the Oracle-IBM-Microsoft triangle love, so I’ll stop here until real facts are added to the story.
via: http://wikibon.org/blog/oracle-and-ibm-may-not-know-big-data-but-neither-does-ballmer/
Wednesday, 15 June 2011
Steve Ballmer on Microsoft and Big Data
InformationWeek quoting Steve Ballmer:
Nobody plays in big data, really, except Microsoft and Google
[…]
I’ll use the word ‘data’ rather than ‘BI’ because that says I want to use all the world’s information… not just the information that we figured out how to capture inside our corporate system.
[…]
The explosion in the use of data is not always in a traditional BI-ish way, and that’s a big thing for us
But what is behind these words? The InformationWeek article is built around Steve Ballmer’s vision that, based on the knowledge gained creating and operating Bing and AdCenter, Microsoft is better prepared to handle Big Data than EMC, HP, IBM, Oracle, and Teradata. While this may be true, there’s an important distinction to draw here: creating and operating internal processes and tools is a different business than developing, selling, and supporting tools for customers. Microsoft has experience in both of these fields, but its current products do not yet combine the know-how from the two. Meanwhile the companies mentioned above, plus quite a few startups and open source projects, are betting their future on this market alone by staying focused.
via: http://www.informationweek.com/news/software/bi/230700013