


appliance: All content tagged as appliance in NoSQL databases and polyglot persistence

Dell and Cloudera and Intel join forces for appliances

As I wrote in “Intel kills a Hadoop and feeds another”:

As for Intel, what if this investment also sealed an exclusive deal for Hadoop-centric Cloudera-supported Intel-powered appliance?

I didn’t know about the existing Dell-Cloudera-Intel partnership, but it is reinforced by the recent announcement of an in-memory appliance.

Since 2011, Cloudera, Dell and Intel have built pre-validated reference architectures for Hadoop. […]

The Dell In-Memory Appliances for Cloudera Enterprise is yet another proof point of the collaboration and synergies between the three companies. As the first of a family of appliances, it includes leading Dell hardware, Cloudera’s enterprise data hub (based on Cloudera Enterprise), Intel architecture for fast processing, and ScaleMP’s Versatile SMP (vSMP) architecture to aggregate multiple x86 servers into a single virtual machine to create large memory pools for in-memory processing.

Original title and link: Dell and Cloudera and Intel join forces for appliances (NoSQL database©myNoSQL)

Cloudera Distribution of Hadoop Powers Oracle’s Big Data Appliance

The announcement of the Oracle Big Data Appliance has been out for only a couple of hours and has already hit all the media sites. Before looking at the details of the announcement, let’s try to understand what it means for the parties involved.

What does it mean for Oracle?

  • Oracle enters a very busy Hadoop market associated with the best-known company in the Hadoop ecosystem
  • With this partnership, Oracle didn’t have to make a huge investment in software development or services
  • Not having to build its own distribution of Hadoop, Oracle could focus on developing the Oracle Big Data Connectors
  • Oracle will delegate everything Hadoop to Cloudera, so it won’t have to deal with a very fast-evolving open source project that might see some interesting events due to the […]
  • Oracle seems to have changed the message about Hadoop being used only for basic ETL.

What does it mean for Cloudera?

  • Cloudera gets access to a pool of customers (many of them possibly very large customers)
  • Cloudera will not need a big sales force to reach these potential customers. Even if Cloudera knew about them, Oracle’s sales force will do the job
  • If Oracle spells Cloudera’s name in every sales pitch, Cloudera will see a huge publicity bump that will sooner or later lead to more customers

Truth is I was expecting yet another distribution of Hadoop. And even if Oracle’s Big Data Appliance doesn’t feature the official Apache Hadoop distribution, I think that by choosing an existing distribution, Oracle did the right thing. For them and for their customers.


Oracle Big Data Appliance Roundup: What, Why, How

Oracle Big Data Appliance Sales Pitch

The Oracle Database Insider Blog:

Offering customers an end-to-end solution for Big Data, the Oracle Big Data Appliance, in conjunction with Oracle Exadata Database Machine and the new Oracle Exalytics Business Intelligence Machine, delivers everything customers need to acquire, organize, analyze and maximize the value of Big Data within their enterprise.

What’s in the box?

The Oracle Database Insider Blog:

  • Oracle Big Data Appliance: The Oracle Big Data Appliance is an engineered system optimized for acquiring, organizing and loading unstructured data into Oracle Database 11g.
  • Oracle Data Integrator Application Adapter for Hadoop: The new Hadoop adapter simplifies data integration from Hadoop and an Oracle Database through Oracle Data Integrator’s easy to use interface.
  • Oracle Loader for Hadoop: Oracle Loader for Hadoop enables customers to use Hadoop MapReduce processing to create optimized data sets for efficient loading and analysis in Oracle Database 11g. Unlike other Hadoop loaders, it generates Oracle internal formats to load data faster and use less database system resources.
  • Oracle R Enterprise: Oracle R Enterprise integrates the open-source statistical environment R with Oracle Database 11g. Analysts and statisticians can run existing R applications and use the R client directly against data stored in Oracle Database 11g, vastly increasing scalability, performance and security. The combination of Oracle Database 11g and R delivers an enterprise-ready deeply-integrated environment for advanced analytics.

The Oracle Big Data Appliance official page is here.

Oracle Big Data Appliance Market Positioning

Ashok Bindra:

Engineered to work together, the Oracle Big Data Appliance is easily integrated with Oracle Database 11g, Oracle Exadata Database Machine, and Oracle Exalytics Business Intelligence Machine. In essence, said Oracle, it is designed to deliver extreme analytics on all data types, with enterprise-class performance, availability, supportability and security.

Shaun Nichols:

Mendelsohn said the company would pitch the Big Data Appliance as a companion to the Exadata platform and an additional tool for understanding customer behaviour rather than just another repository for information.

“Big is interesting, but traditional warehouses deal with that quite well,” he explained.

Jaikumar Vijayan quoting James Kobielus (Forrester Research):

Today’s announcement is likely to put pressure on rivals such as Teradata, IBM, SAP, Microsoft and EMC to ramp up their own offerings. The onus is on them to “match and surpass Oracle in their roadmaps, offerings and partnerships,” Kobielus said. “Forrester expects M&A activity in these arenas to ramp up now that Oracle has made these aggressive moves.”

Chris Kanaracus:

Pricing and a release date for the machine weren’t immediately available on Monday. When available, it will compete with products such as Aster Data, Netezza and Greenplum.

Oracle Big Data Appliance Technical Details

Oracle Big Data Appliance

Alex Gorbachev:

A rack with InfiniBand, full of 2U servers similar to Exadata Storage. No flash storage needed so couple sockets and a dozen of disks will do. Maybe more ram than Exadata storage cells themselves. I suspect you could have as many servers as you want in a configuration but since Hadoop clusters are usually dozens and more nodes, full rack seems reasonable with about 20 Hadoop compute nodes to start with. Real deployments should easily go into multiple racks stacked together.

Timothy Prickett Morgan:

The underlying hardware for the Big Data Appliance is Oracle’s Exadata x86 clusters, which support a parallel implementation of the Oracle 11g R2 database running on top of Oracle’s RHEL-ish clone of Linux. Oracle Enterprise Linux and Oracle’s twist on the open source Xen hypervisor are the appliance’s underlying layer.

Shaun Nichols:

The rack-based appliance will house 18 server systems and will hold up to 432TB of data and 864GB of memory. The appliance will form the basis of the company’s push into the big data management and analysis space.

Gwen Shapira:

The Big Data Appliance (BDA) has 18 Sun x4270 M2 servers per rack. As usual, you can add racks together for larger clusters. Each node has 48GB RAM, 12 Intel cores and 24TB of storage. Less memory than in the Exadata 2×2 nodes and no SSD indicates that the plan is to hit the spinning magnetic devices a lot for data storage and processing. Not a big deal in Hadoop where this is the design assumption, but not optimal for the NoSQL portion of the device.

In addition there is 40Gb/s InfiniBand and 10Gb/s Ethernet. The choice of InfiniBand for a Hadoop machine is a bit odd, since Hadoop was designed to do most of the processing on the machine that holds the data and avoid overloading the network. On the other hand, connecting the Hadoop cluster to an Exadata machine with InfiniBand will allow for fast data loading. Which is exactly what Oracle is after.
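The per-node figures quoted by Gwen Shapira line up with the rack totals reported by Shaun Nichols. A quick sanity check, using only the numbers quoted above (the function and variable names are mine, purely illustrative):

```python
# Figures quoted above: 18 servers per rack, 48 GB RAM and 24 TB of disk per node.
NODES_PER_RACK = 18
RAM_GB_PER_NODE = 48
DISK_TB_PER_NODE = 24


def rack_totals(nodes, ram_gb_per_node, disk_tb_per_node):
    """Return (total RAM in GB, total raw disk in TB) for one rack."""
    return nodes * ram_gb_per_node, nodes * disk_tb_per_node


total_ram_gb, total_disk_tb = rack_totals(
    NODES_PER_RACK, RAM_GB_PER_NODE, DISK_TB_PER_NODE
)
print(total_ram_gb, total_disk_tb)  # 864 432
```

These are raw capacities. How much of the 432TB is usable depends on the HDFS replication factor, which none of the quoted sources state.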

Thomas Kurian (Oracle EVP):

ETL can deploy on the Hadoop cluster and you can model that using Oracle Integrator ETL tool and then deploy that on Hadoop MapReduce platform. We provide load balancing and after preprocessing is done, [the loader moves] the data set into Oracle. The finished data set then can be piped into Exalytics for analytic dashboards and reports.

Oracle Big Data Appliance: What does it mean to the market and competitors?

Billy Bosworth (DataStax):

I have been around databases for 20 years, and have tons of respect for Oracle.  When someone of their caliber releases a NoSQL solution, it takes us beyond the era of speculation and “niche” and squarely into the mainstream.  It validates our work and our passion and paints a very exciting future for big data databases.

Edd Dumbill:

Whether you use Oracle or not, today’s announcement moves the big data world forward. We have de facto agreement on Hadoop and R as core infrastructure, and we have healthy competition at the database and NoSQL layer.

Max Schireson (10gen):

In my opinion this is a good thing for alternative database vendors. Competition is already thriving in the sector and I don’t think one more competitor, even one as large as Oracle, will alter the dynamics dramatically. But many customers will take Oracle’s arrival in the space as a sign that this trend is significant and it is a space they should look at. If Oracle’s offering is strong, we may lose some market share to them, but their presence will make it a bigger market.

Klint Finley:

One of the big issues at play here is whether enterprises want expensive Oracle appliances, open core software running on commodity hardware or pay-as-you-go public cloud services. As Wikibon analyst Jeff Kelley notes, “Ellison knows Oracle needs to have some Hadoop/NoSQL offering, but the open source/commodity hardware/scale-out approach to Big Data is the antithesis of the Oracle way: closed source/Sun-only hardware/scale-up.”

French Caldwell (Gartner):

Got big data problems?  Got cloud angst?   Just put all your worries in a big iron box.  At least that’s what I took away after two hours of keynotes from Oracle and EMC executives this morning.   Big data and the cloud are euphemisms for huge information management and business challenges, but listening to the keynotes, you’d think it’s just a technical problem.  The proliferation of vast amounts of unstructured content and a revolution in IT provisioning models, and even digital dependent revenue streams are not issues to be trifled with.  But at the opening of Open World, the dumbing down of these challenges is exactly what happened.  The vision communicated is that the solution is that you can put it all in a big data box, or a BI machine.


Ashok Bindra:

According to Oracle, the Big Data Appliance is a new system that includes an open source distribution of Apache Hadoop, Oracle NoSQL Database, Oracle Data Integrator Application Adapter for Hadoop, Oracle Loader for Hadoop, and an open source distribution of R.


My predictions came true. Almost all of them.


Cloudera Hadoop Distribution on Dell's Commodity Servers


On the hardware side, the package can come with either Dell PowerEdge C2100, C6100 or C6105 servers. The PowerEdge C-series servers are uniquely suited for Hadoop’s multiserver deployments because of their modest physical size and power usage, […] A deployment based on the reference architecture could scale from six nodes to 720 nodes.


The cost of a minimum configuration would run from US$118,000 to $124,000, depending on the support options.

What’s that definition of commodity servers again?
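To put a number on that question, here is a rough per-node figure. It is a back-of-the-envelope sketch that assumes the minimum configuration is the six-node low end of the quoted range, which the excerpt does not say explicitly. The quoted price also varies with support options, so this is not a bare hardware price.

```python
# Prices quoted above for a minimum configuration (USD).
MIN_CONFIG_PRICE_LOW = 118_000
MIN_CONFIG_PRICE_HIGH = 124_000
# ASSUMPTION: the minimum configuration is the six-node low end of the
# "six nodes to 720 nodes" range; the quoted excerpt does not confirm this.
MIN_CONFIG_NODES = 6


def price_per_node(total_price, nodes):
    """Average price per node for a package of `nodes` servers."""
    return total_price / nodes


low = price_per_node(MIN_CONFIG_PRICE_LOW, MIN_CONFIG_NODES)
high = price_per_node(MIN_CONFIG_PRICE_HIGH, MIN_CONFIG_NODES)
print(f"${low:,.0f} to ${high:,.0f} per node")  # $19,667 to $20,667 per node
```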



SAP HANA: In-Memory Analytical Appliance

Dennis Moore:

SAP HANA does manage data in memory, for nearly incredible performance in some applications, but it also manages to persist that data on disk, making it suitable for analytical applications and transactional applications – simultaneously.

SAP HANA architecture

The architecture diagram above doesn’t show anything uncommon: a good ecosystem and a (pretty classical?) storage engine with an in-memory layer. The Calc Engine and MDX support, though, are not something you would find in a typical relational database engine.

But here is the problem:

In the short-term, it seems that SAP still struggles to generate references for HANA, other than in a narrow set of custom data-warehouse-type analytics.


When HANA is generally available […]

The way I read it is: even with selected clients HANA doesn’t seem to provide the promised value. The real question is why? Isn’t it cost effective? Doesn’t HANA bring enough innovation to solve real problems? Is the in-memory layer not enough for addressing the range of problems HANA is promising to solve? Is the competition providing better or more effective solutions?



IBM Launches First Netezza Appliance

The pitch:

The IBM® Netezza High Capacity Appliance extends IBM Netezza’s family of data warehouse appliances to new extremes of data capacity, scaling to multiple petabytes of user data. This will enable organizations to meet a variety of analytical and historical data storage requirements with a single cost-effective appliance.

The reason for posting about it is this price information from the ZDNet announcement:

The big pitch for Netezza is the price per user per terabyte[1]. Mills said the Netezza appliance will run about $2,500 per user per terabyte compared to an average of $10,000.

  1. My emphasis.  
