


Oracle Big Data Appliance: All content tagged as Oracle Big Data Appliance in NoSQL databases and polyglot persistence

Oracle Big Data Appliance Released Features Cloudera Distribution of Hadoop: What You Need to Know

Oracle Big Data Appliance hardware specification

Klint Finley for ServicesANGLE:

18 Oracle Sun servers with a total of:

  • 864 GB main memory;
  • 216 CPU cores;
  • 648 TB of raw disk storage;
  • 40 Gb/s InfiniBand connectivity between nodes and other Oracle engineered systems; and,
  • 10 Gb/s Ethernet data center connectivity.
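As a quick sanity check on the totals above, dividing each figure by the 18 servers gives the implied per-server configuration. This is only a sketch: the variable names are illustrative, and the even split across servers is an assumption rather than something the quoted specification states.

```python
# Rack totals as listed above (18 Oracle Sun servers).
SERVERS = 18
rack_totals = {
    "memory_gb": 864,
    "cpu_cores": 216,
    "raw_disk_tb": 648,
}

# Assumption: resources are spread evenly across all servers.
per_node = {name: total / SERVERS for name, total in rack_totals.items()}

for name, value in per_node.items():
    print(f"{name}: {value:g} per server")
```

Under that assumption each server works out to 48 GB of memory, 12 cores and 36 TB of raw disk.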

Joab Jackson for PCWorld Business Center:

The package includes 40Gb/s InfiniBand connectivity among the nodes, a rarity among Hadoop deployments, many of which use Ethernet to connect the nodes. Lumpkin said InfiniBand would speed data transfers within the system. Multiple racks can be tethered together in a cluster configuration. There is no theoretical limit to how many racks can be clustered together, though configurations of more than eight racks would require additional switches, Lumpkin said.

Oracle Big Data Appliance software specification

  • Cloudera’s Distribution including Apache Hadoop
  • Cloudera Manager
  • Open source distribution of R
  • Oracle NoSQL Database Community Edition
  • Oracle Big Data Connectors
  • Oracle Linux

Joab Jackson for PCWorld Business Center:

Along with the release, Oracle also released Oracle Big Data Connectors, a set of drivers for exchanging data between the Big Data Appliance and other Oracle products, such as the Oracle Database 11g, the Oracle Exadata Database Machine, Oracle Exalogic Elastic Cloud and Oracle Exalytics In-Memory Machine.

Derrick Harris for GigaOm:

However, Oracle isn’t blind to the fact that not everyone will be gung ho about buying an appliance. Its custom-built Big Data Connectors are available as separate products for those customers wanting to connect existing Hadoop clusters to Oracle database environments or R statistical-analysis environments.

Klint Finley for ServicesANGLE:

According to Oracle’s announcement, “The integrated Oracle and Cloudera architecture has been fully tested and validated by Oracle, who will also collaborate with Cloudera to provide support for Oracle Big Data Appliance.”

Oracle Big Data Appliance Services

George Lumpkin, Oracle’s vice president of data warehousing product management:

Oracle will provide first-line support for the appliance and all software (including the Hadoop distribution and Cloudera Manager) through its case-tracking support infrastructure. But when particularly tough support cases arise, Oracle will tap Cloudera’s expertise.

What’s more, Oracle will refer customers to Cloudera for Hadoop training and consulting engagements.

Oracle Big Data Appliance Positioning

George Lumpkin, Oracle’s vice president of data warehousing product management:

We are positioning this as something that runs alongside other Oracle-based systems. Big data is more than just a cluster of hardware running Hadoop. It is an overall information architecture for enabling companies to analyze data and make decisions.

Doug Henschen for InformationWeek:

Oracle highlighted the Big Data Appliance as a complement to a growing family of “engineered systems” that now includes Exadata, Exalogic, and the Exalytics In-Memory Machine.

Merv Adrian (Gartner analyst) cited by InformationWeek:

But what’s more remarkable is the fact that Oracle is finally looking beyond its core database. Oracle’s TimesTen and Essbase databases, which were recently upgraded for use in the Exalytics appliance, and BerkeleyDB, which was Oracle’s development starting point for the new NoSQL database, are examples of that shift.

Oracle is suddenly beginning to act as a data-management portfolio company, not just a company with a big brother and a bunch of starving siblings.

Joab Jackson for PCWorld Business Center:

Oracle is positioning the appliance for managing and analyzing large sets of data that may be too large, or otherwise unsuitable for keeping in databases, such as telemetry data, click-stream data or other log data. “You may not want to keep the data in a database, but you do want to store it and analyze it,” Lumpkin said. The appliance is intended for those organizations that want to undertake Big Data-style analysis but may not have the in-house expertise to assemble large Hadoop or NoSQL-based systems.


Kurt Dunn, Cloudera’s chief operating officer, told InformationWeek:

Oracle has put together a very comprehensive product that is priced very well.

Brian Proffitt for ITworld:

The cost of the Big Data Appliance is what will really stand out. At $500,000, this may not seem like a bargain, but in reality it is. Typically, commoditized Hadoop systems run at about $4,000 a node. To get this much data storage capacity and power, you would need about 385 nodes… which puts the price tag at around $1.54 million—three times the price of Oracle’s Cloudera-based offering (which, I should add, excludes things like support costs and power).
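The arithmetic behind Proffitt's comparison can be reproduced in a few lines. This is a rough sketch that uses only the figures he quotes; the constant names are made up for illustration, and the 385-node equivalence is his estimate, not something verified here.

```python
# Figures quoted by Brian Proffitt above.
COST_PER_NODE_USD = 4_000       # typical commodity Hadoop node
EQUIVALENT_NODES = 385          # his estimate for comparable capacity
APPLIANCE_PRICE_USD = 500_000   # price he cites for the appliance

commodity_total = COST_PER_NODE_USD * EQUIVALENT_NODES
ratio = commodity_total / APPLIANCE_PRICE_USD

print(f"Commodity cluster: ${commodity_total:,}")
print(f"Ratio to appliance price: {ratio:.2f}x")
```

The product comes to $1,540,000, roughly three times the appliance price he cites.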

Doug Henschen for InformationWeek:

The hardware and software combined will sell for $450,000, with an annual support fee for both hardware and software of 12%. That’s highly competitive, working out to less than $700 per terabyte and being in line with the low costs big data practitioners expect from deployments built on commodity hardware.
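The per-terabyte figure can be checked against the hardware specification quoted earlier. The sketch below assumes the price is divided over the 648 TB of raw disk from that list (the quote does not say which capacity figure was used) and that the 12% support fee is charged on the full purchase price; the names are illustrative.

```python
PRICE_USD = 450_000      # combined hardware and software price
RAW_STORAGE_TB = 648     # raw disk from the hardware specification (assumed basis)
SUPPORT_PERCENT = 12     # annual support fee, as a percentage of price

price_per_tb = PRICE_USD / RAW_STORAGE_TB
annual_support = PRICE_USD * SUPPORT_PERCENT // 100

print(f"Price per raw TB: ${price_per_tb:,.2f}")
print(f"Annual support:   ${annual_support:,}")
```

On those assumptions the price lands just under $700 per raw terabyte, consistent with the quoted claim, and support adds $54,000 a year.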

Oracle - Cloudera Partnership

Earlier, I wrote my take on what this partnership means for both Oracle and Cloudera.

Doug Henschen for InformationWeek:

But by releasing the product early in the year in partnership with Cloudera, which has more customers and years in the market than any other Hadoop software and services provider, Oracle has made it clear that it is wasting no time and taking no chances with unproven technology.

“Cloudera brings us a couple of very important missing pieces, including its management software and assistance for a deeper second- and third-tier level of support,” said George Lumpkin, Oracle’s vice president of product management, data warehousing.

Speculations about the future of the Oracle - Cloudera partnership

Brian Proffitt for ITworld:

Students of Linux history will well remember that’s exactly what happened when Oracle partnered with Red Hat to introduce commoditized Oracle offerings… and then Larry Ellison and crew decided to roll their own Oracle Enterprise Linux in 2006 when they decided to cut Red Hat out of the stack.

This is strong historical evidence that Oracle will do the same with Cloudera, because frankly the big data market is too big for Oracle not to want to own. Big Data Appliance customers should note this, and be very prepared that future versions may not be tied to Cloudera at all, but rather Oracle’s version of Hadoop.

A few people suggested on Twitter that this partnership is a sign of a possible acquisition of Cloudera by Oracle. TechCrunch’s Leena Rao links to an old post by Matt Asay suggesting this acquisition.



Oracle Big Data Appliance Roundup: What, Why, How

Oracle Big Data Appliance Sales Pitch

The Oracle Database Insider Blog:

Offering customers an end-to-end solution for Big Data, the Oracle Big Data Appliance, in conjunction with Oracle Exadata Database Machine and the new Oracle Exalytics Business Intelligence Machine, delivers everything customers need to acquire, organize, analyze and maximize the value of Big Data within their enterprise.

What’s in the box?

The Oracle Database Insider Blog:

  • Oracle Big Data Appliance: The Oracle Big Data Appliance is an engineered system optimized for acquiring, organizing and loading unstructured data into Oracle Database 11g.
  • Oracle Data Integrator Application Adapter for Hadoop: The new Hadoop adapter simplifies data integration from Hadoop and an Oracle Database through Oracle Data Integrator’s easy to use interface.
  • Oracle Loader for Hadoop: Oracle Loader for Hadoop enables customers to use Hadoop MapReduce processing to create optimized data sets for efficient loading and analysis in Oracle Database 11g. Unlike other Hadoop loaders, it generates Oracle internal formats to load data faster and use less database system resources.
  • Oracle R Enterprise: Oracle R Enterprise integrates the open-source statistical environment R with Oracle Database 11g. Analysts and statisticians can run existing R applications and use the R client directly against data stored in Oracle Database 11g, vastly increasing scalability, performance and security. The combination of Oracle Database 11g and R delivers an enterprise-ready deeply-integrated environment for advanced analytics.

The Oracle Big Data Appliance official page is here.

Oracle Big Data Appliance Market Positioning

Ashok Bindra:

Engineered to work together, the Oracle Big Data Appliance is easily integrated with Oracle Database 11g, Oracle Exadata Database Machine, and Oracle Exalytics Business Intelligence Machine. In essence, said Oracle, it is designed to deliver extreme analytics on all data types, with enterprise-class performance, availability, supportability and security.

Shaun Nichols:

Mendelsohn said the company would pitch the Big Data Appliance as a companion to the Exadata platform and an additional tool for understanding customer behaviour rather than just another repository for information.

“Big is interesting, but traditional warehouses deal with that quite well,” he explained.

Jaikumar Vijayan quoting James Kobielus (Forrester Research):

Today’s announcement is likely to put pressure on rivals such as Teradata, IBM, SAP, Microsoft and EMC to ramp up their own offerings. The onus is on them to “match and surpass Oracle in their roadmaps, offerings and partnerships,” Kobielus said. “Forrester expects M&A activity in these arenas to ramp up now that Oracle has made these aggressive moves.”

Chris Kanaracus:

Pricing and a release date for the machine weren’t immediately available on Monday. When available, it will compete with products such as Aster Data, Netezza and Greenplum.

Oracle Big Data Appliance Technical Details

Oracle Big Data Appliance

Alex Gorbachev:

A rack with InfiniBand, full of 2U servers similar to Exadata Storage. No flash storage needed so couple sockets and a dozen of disks will do. Maybe more ram than Exadata storage cells themselves. I suspect you could have as many servers as you want in a configuration but since Hadoop clusters are usually dozens and more nodes, full rack seems reasonable with about 20 Hadoop compute nodes to start with. Real deployments should easily go into multiple racks stacked together.

Timothy Prickett Morgan:

The underlying hardware for the Big Data Appliance is Oracle’s Exadata x86 clusters, which support a parallel implementation of the Oracle 11g R2 database running on top of Oracle’s RHEL-ish clone of Linux. Oracle Enterprise Linux and Oracle’s twist on the open source Xen hypervisor are the appliance’s underlying layer.

Shaun Nichols:

The rack-based appliance will house 18 server systems and will hold up to 432TB of data and 864GB of memory. The appliance will form the basis of the company’s push into the big data management and analysis space.

Gwen Shapira:

The Big Data Appliance (BDA) has 18 Sun x4270 M2 servers per rack. As usual, you can add racks together for larger clusters. Each node has 48 GB of RAM, 12 Intel cores and 24 TB of storage. Less memory than in the Exadata 2×2 nodes and no SSD indicates that the plan is to hit the spinning magnetic devices a lot for data storage and processing. Not a big deal in Hadoop where this is the design assumption, but not optimal for the NoSQL portion of the device.

In addition there is 40 Gb/s InfiniBand and 10 Gb/s Ethernet. The choice of InfiniBand for a Hadoop machine is a bit odd, since Hadoop was designed to do most of the processing on the machine that holds the data and avoid overloading the network. On the other hand, connecting the Hadoop cluster to an Exadata machine with InfiniBand will allow for fast data loading. Which is exactly what Oracle is after.

Thomas Kurian (Oracle EVP):

ETL can deploy on the Hadoop cluster and you can model that using Oracle Integrator ETL tool and then deploy that on Hadoop MapReduce platform. We provide load balancing and after preprocessing is done, [the loader moves] the data set into Oracle. The finished data set then can be piped into Exalytics for analytic dashboards and reports.

Oracle Big Data Appliance: What does it mean to the market and competitors?

Billy Bosworth (DataStax):

I have been around databases for 20 years, and have tons of respect for Oracle. When someone of their caliber releases a NoSQL solution, it takes us beyond the era of speculation and “niche” and squarely into the mainstream. It validates our work and our passion and paints a very exciting future for big data databases.

Edd Dumbill:

Whether you use Oracle or not, today’s announcement moves the big data world forward. We have de facto agreement on Hadoop and R as core infrastructure, and we have healthy competition at the database and NoSQL layer.

Max Schireson (10gen):

In my opinion this is a good thing for alternative database vendors. Competition is already thriving in the sector and I don’t think one more competitor, even one as large as Oracle, will alter the dynamics dramatically. But many customers will take Oracle’s arrival in the space as a sign that this trend is significant and it is a space they should look at. If Oracle’s offering is strong, we may lose some market share to them, but their presence will make it a bigger market.

Klint Finley:

One of the big issues at play here is whether enterprises want expensive Oracle appliances, open core software running on commodity hardware or pay-as-you-go public cloud services. As Wikibon analyst Jeff Kelley notes, “Ellison knows Oracle needs to have some Hadoop/NoSQL offering, but the open source/commodity hardware/scale-out approach to Big Data is the antithesis of the Oracle way: closed source/Sun-only hardware/scale-up.”

French Caldwell (Gartner):

Got big data problems? Got cloud angst? Just put all your worries in a big iron box. At least that’s what I took away after two hours of keynotes from Oracle and EMC executives this morning. Big data and the cloud are euphemisms for huge information management and business challenges, but listening to the keynotes, you’d think it’s just a technical problem. The proliferation of vast amounts of unstructured content and a revolution in IT provisioning models, and even digital dependent revenue streams are not issues to be trifled with. But at the opening of Open World, the dumbing down of these challenges is exactly what happened. The vision communicated is that the solution is that you can put it all in a big data box, or a BI machine.


Ashok Bindra:

According to Oracle, the Big Data Appliance is a new system that includes an open source distribution of Apache Hadoop, Oracle NoSQL Database, Oracle Data Integrator Application Adapter for Hadoop, Oracle Loader for Hadoop, and an open source distribution of R.


My predictions turned out to be true. Almost all of them.


R: the Leading Statistics Language and Key Weapon in Advanced Analytics Today

David Smith (Revolution Analytics):

Of course, this isn’t the first time that R has been embedded into a data warehousing appliance. IBM Netezza’s iClass device integrates with Revolution R, and AsterData, the Teradata Data Warehouse Appliance, and Greenplum all provide connections to R as well. Here at Revolution Analytics, we think that such enterprise-level integrations with R serve to grow the R ecosystem and serve as validation of R as a key platform for advanced analytics. As CEO Norman Nie said to GigaOm this weekend,

“Oracle’s announcement to embed R demonstrates validation for the leading statistics language and offers further evidence that R is a key weapon in advanced analytics today”

And let’s not leave aside the strategic partnership between Revolution Analytics and Cloudera to include RevoConnectR in the CDH.



Hadoop and NoSQL Mythbusting

Gwen Shapira:

With all the buzz in OOW about the big data machine, there was also a lot of nonsense flying around. I love it that the Oracle community is finally interested in Hadoop and NoSQL, but I hate it when people sound authoritative without having an actual clue. I’ve left a few presentations with smoke coming out of my ears.

The one that seems to be an integral part of the Oracle Big Data Appliance message is that “Hadoop can only be used for basic ETL transformations. Real data analysis has to be done in Oracle and BI tools”. Unfortunately I’ve heard the same thing coming from IBM Netezza.
