Exadata: All content tagged as Exadata in NoSQL databases and polyglot persistence

The Market for Products Like HANA and Exadata

Larry Dignan summarizing research by a Cowen analyst:

We believe the trend in compute is massively distributed commodity boxes, and while there is a market for products like HANA and Exadata, it is significantly smaller than optimistic descriptions by SAP and Oracle. We believe HANA revenue is being inflated by subjective product revenue allocation to HANA at the expense of its traditional Apps and BI businesses, and swapping HANA for unused licenses. We think this inflates seat count with no incremental cash to the firm. At some point investors will likely start to worry about the implications on the other 90% of license sales we expect will eventually turn negative. We believe the offense in this case is getting into the data management business with old product (Sybase) and high priced hardware (HANA). While the impact on the model is low, the reputational cost to management could be high.

I have to confess that I’ve never really understood what these estimations are built upon. Maybe if I got this, I’d start calling myself an analyst.

Original title and link: The Market for Products Like HANA and Exadata (NoSQL database©myNoSQL)


Counting Triangles Smarter (Or How to Beat Big Data Vendors at Their Own Game)

Davy Suvee showing that Datablend’s custom datastore could deliver better performance than generic solutions like Hadoop, Vertica, or Exadata:

Although Vertica and Oracle’s results are impressive, they require a significant hardware setup of 4 nodes, each containing 96GB of RAM and 12 cores. My challenge: beating the Big Data vendors at their own game by calculating triangles through a smarter algorithm that is able to deliver similar performance on commodity hardware (i.e. my MacBook Pro Retina).

Considering the size of the data (86 million relationships), I wonder what the result would be using a graph database like Neo4j. Anyone up for testing it?
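For readers who want a feel for the problem, here is a minimal sketch of triangle counting on an undirected graph. It is not Davy Suvee's algorithm (the excerpt above does not describe it), nor what Vertica or Oracle ran; it is the textbook degree-ordered set-intersection approach, and the function name `count_triangles` and the sample edge list are made up for illustration. It assumes node ids are mutually comparable (all integers or all strings).

```python
from collections import defaultdict


def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) pairs."""
    # Build adjacency sets, ignoring self-loops and duplicate edges.
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    # Order nodes by (degree, id) and keep only edges that point from a
    # lower-ranked node to a higher-ranked one. Every triangle then has
    # exactly one lowest-ranked corner, so it is counted exactly once.
    def rank(node):
        return (len(neighbors[node]), node)

    forward = {
        node: {n for n in adj if rank(n) > rank(node)}
        for node, adj in neighbors.items()
    }

    total = 0
    for u, outs in forward.items():
        for v in outs:
            # Common higher-ranked neighbors of u and v close a triangle.
            total += len(outs & forward[v])
    return total


# Hypothetical sample: two triangles (1-2-3 and 3-4-5) sharing node 3.
sample_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)]
print(count_triangles(sample_edges))
```

Ordering by degree keeps the intersected sets small for high-degree nodes, which is where naive approaches tend to spend most of their time. Whether this would hold up at 86 million relationships on a laptop is exactly the kind of thing that needs measuring.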



The Oracle NoSQL Database and Big Data Appliance

There’s been a lot of speculation about the announcements coming from Oracle’s OpenWorld event. A first part was revealed during the keynote in the form of an in-memory analytics appliance called Exalytics [2]. But there’s talk about a Big Data Appliance and an Oracle NoSQL database.

Here are my predictions[1]:

  1. Oracle has become very aggressive in selling products that combine hardware, software, and services. So they’ll announce a Hadoop appliance integrated with an existing Oracle product. It could be either the Oracle Exadata or even the newly announced Exalytics.

    This appliance will place Oracle in competition with all other Hadoop appliance sellers: EMC, NetApp, IBM. Also these days most of the analytics databases try to integrate with Hadoop.

  2. Oracle already has a couple of non-relational solutions in its portfolio: Berkeley DB, TimesTen, Coherence. And they’ve already started to test the NoSQL market by announcing the MySQL and MySQL Cluster NoSQL hybrid systems.

    I don’t expect the Oracle NoSQL database to be a new product. Just a rebranding or repackaging of one of the above-mentioned ones. Probably TimesTen.

  3. Oracle will invest more into integrating its line of products with Hadoop. Having both a Hadoop and an in-memory analytics appliance will make them very competitive in this space.

  4. Oracle will extend the support for NoSQLish interfaces (memcached) to its other database products.

What are your predictions?

  1. or speculations  

  2. I’m currently gathering more details about Exalytics.  


Will Oracle Win the NoSQL Competition

I agree this title is misleading but the problem is clear: today Oracle does not provide any product that can compete with new cloud computing needs and with the NoSQL movement. It is not possible to think that Oracle’s RAC technology can actually be used in a cloud environment, and a cloud service cannot be deployed over an Exadata either.

The real question though is whether Oracle is really interested in the market currently served by NoSQL databases and/or hybrid solutions. And judging by the latest versions of MySQL and MySQL Cluster[1] it looks like they are testing the waters.

  1. Latest versions of MySQL and MySQL Cluster are adding support for using the Memcached protocol. See NoSQL to MySQL with Memcached  



Enterprise Big Data Stack vs Open Source Big Data Stack

Goldmacher estimated that YouTube consumption—user uploads of 48 hours of video a minute and 3 billion videos a day along with roughly 45 petabytes of viewed videos a day—would require at least 9 full-rack Exadata machines at $1.5 million each. There would be at least 18 Exadata machines to handle spikes. Those machines would add up to 14 Exalogic devices to serve data at $1.1 million per system. The software stack under Oracle would include WebLogic middleware, Oracle databases, Exadata optimized storage and Oracle as operating system. The open source comparison included JBoss middleware, MySQL, Hadoop and Red Hat Enterprise Linux as the OS.

Big Data Enterprise Stack

Big Data Open Source Stack

Credit Peter Goldmacher (Cowen & Co. analyst)

Two comments (the only ones I have):

  1. what advantages would the enterprise stack offer to justify a 5x cost?
  2. in case all numbers are completely wrong, what’s the advantage of the enterprise stack?
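To put the quoted hardware figures side by side, here is a back-of-the-envelope sketch using only the list prices from the quote above. Two assumptions of mine: "at least 18 Exadata machines to handle spikes" is read as 18 machines in total (not 9 plus 18), and "up to 14 Exalogic devices" is taken as 14. Software licenses and the open source stack are left out because the quote gives no prices for them, so this does not reproduce the analyst's full comparison.

```python
# List prices taken from the quote above, in US dollars.
EXADATA_PRICE = 1_500_000   # per full-rack Exadata machine
EXALOGIC_PRICE = 1_100_000  # per Exalogic system


def hardware_cost(exadata_units, exalogic_units):
    """Hardware list price only; no software, support, or discounts."""
    return exadata_units * EXADATA_PRICE + exalogic_units * EXALOGIC_PRICE


# Steady-state minimum: 9 Exadata racks, no Exalogic counted.
baseline = hardware_cost(9, 0)

# Assumed spike configuration: 18 Exadata racks plus 14 Exalogic systems.
with_spikes = hardware_cost(18, 14)

print(f"baseline Exadata only: ${baseline:,}")
print(f"spike configuration:   ${with_spikes:,}")
```

Under these assumptions the Exadata racks alone come to $13.5 million at steady state, and the spike configuration comes to $42.4 million in hardware list price.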



Types of Big Data Work

Mike Minelli: Working with big data can be classified into three basic categories […] One is information management, a second is business intelligence, and the third is advanced analytics.

Information management captures and stores the information, BI analyzes data to see what has happened in the past, and advanced analytics is predictive, looking at what the data indicates for the future.

There’s also a list of tools for Big Data: Aster Data (acquired by Teradata), Datameer, ParAccel, IBM Netezza, Oracle Exadata, EMC Greenplum.
