ALL COVERED TOPICS

NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter

NAVIGATE MAIN CATEGORIES

Close

Oracle: All content tagged as Oracle in NoSQL databases and polyglot persistence

Nokia’s Big Data Ecosystem: Hadoop, Teradata, Oracle, MySQL

Nokia’s big data ecosystem consists of a centralized, petabyte-scale Hadoop cluster that is interconnected with a 100-TB Teradata enterprise data warehouse (EDW), numerous Oracle and MySQL data marts, and visualization technologies that allow Nokia’s 60,000+ users around the world tap into the massive data store. Multi-structured data is constantly being streamed into Hadoop from the relational systems, and hundreds of thousands of Scribe processes run every day to move data from, for example, servers in Singapore to a Hadoop cluster in the UK. Nokia is also a big user of Apache Sqoop and Apache HBase.

In the coming years you’ll hear more often stories—sales pitches—about single unified platforms solving all these problems at once. But platforms that will survive and thrive are those that will accomplish two things:

  1. keep the data gates open: in and out.
  2. work with different other platform to make this efficiently for users

Original title and link: Nokia’s Big Data Ecosystem: Hadoop, Teradata, Oracle, MySQL (NoSQL database©myNoSQL)

via: http://blog.cloudera.com/blog/2013/04/customer-spotlight-nokias-big-data-ecosystem-connects-cloudera-teradata-oracle-and-others/


Wikipedia Migrates to MariaDB... but facts are facts

Jon Buys:

There was, and continues to be, concern over Oracle’s treatment of the open source competitor to their own Oracle database. I personally have wondered what motivation, if any, Oracle has to maintain MySQL. They may simply be milking the revenue stream created by MySQL AB until the well goes dry. Since MariaDB is surpassing MySQL in performance and community goodwill, that day may come sooner rather than later.

A couple of little known things:

  1. Oracle has been house for InnoDB since 2005. InnoDB was and continues to be the default, recommended engine for MySQL. Before and after Oracle acquired MySQL through Sun Microsystems.
  2. Oracle has been house for Sleepycat’s BerkleyDB since 2006. Those products are definitely not dead. Community-wise maybe they haven’t put much effort into extending it.

Facts are facts.

Original title and link: Wikipedia Migrates to MariaDB… but facts are facts (NoSQL database©myNoSQL)

via: http://ostatic.com/blog/wikipedia-migrates-to-mariadb


Oracle and DataStax on TechCrunch

I have some serious doubts about Alex Williams’s post on TechCrunch about the connection between the recently announced results from Oracle and DataStax. To exemplify, these paragraphs don’t make a lot of sense to me:

The reason for the drop has more to do with the enterprise acceptance of online applications more than anything else, said Datatastax CEO Billy Bosworth in an interview last week.

Does it mean that enterprises are discovering online applications now?

When companies come to Datastax, they say the number one thing they need is security, Bosworth said. They are building from day one to avoid disaster scenarios.

DataStax introduced security features just recently, so I’ll assume Billy Bosworth was actually referring to fault tolerance and resilience. What ended up in the article is a different story.

Datastax has its own challenges. It competes with Amazon Web Services and all the other NoSQL providers such as 10gen.

Once again I’ll assume the author wanted to refer to Amazon Dynamo (and RDS?), but thought it’ll read better as “Amazon Web Services”.

Actually, now that I read it twice, I realize that I shouldn’t link to it. But at least I can suggest you to waste no time with it.

Original title and link: Oracle and DataStax on TechCrunch (NoSQL database©myNoSQL)

via: http://techcrunch.com/2013/03/24/oracle-is-bleeding-at-the-hands-of-database-rivals/


Oracle Paper: The Cost of Do-It-Yourself Hadoop vs Oracle Big Data Appliance

Based on ESG’s modeling of a medium-sized Hadoop-oriented big data project, the preconfigured Oracle Big Data Appliance is 39% less costly than a “build” equivalent do-it-yourself infrastructure. And using Oracle Big Data Appliance will cut the project length by about one-third. For most enterprises planning to take big data beyond experimentation and proof-of- concept, ESG suggests skipping the idea of in-house development, on-going management, and expansion of your own big data infrastructure, to instead look to purpose-built infrastructure solutions such as Oracle Big Data Appliance.

This is an extract from Oracle’s whitepaper “Getting Real about Big Data: Build Versus Buy“. It’s a nice reading excercise to better understand how the database leader is positioning their Oracle Big Data Appliance compared to Hadoop’s commodity-hardware cluster.

I’d love seeing the equivalent paper from Hortonworks1.


  1. The only reason I’m referring directly to Hortonworks and not also Cloudera is that the Hadoop part of Oracle Big Data Appliance is offered by Cloudera

Original title and link: Oracle Paper: The Cost of Do-It-Yourself Hadoop vs Oracle Big Data Appliance (NoSQL database©myNoSQL)


DataStax's Reaction to MySQL 5.6: Oracle’s MySQL Misses the NoSQL Mark

Jonathan Ellis in a post about MySQL 5.6 and how Oracle got the whole NoSQL wrong, considering NoSQL is, in this exact order, about scaling, continuous availability, flexibility, performance, and queryability:

The big news for MySQL 5.6 was the inclusion of “NoSQL” features in the form of a memcached api for get and put operations.

In cases like this, it’s tough to tell whether Oracle got this so wrong deliberately to sow confusion in the market, or because they really think that’s what NoSQL is about.

I know Jonathan Ellis has always had very strong opinions about the technical superiority of Cassandra and Cassandra is indeed a very solid solution, but I’m always reluctant to calling a competitor stupid and using the myopic argument “if I’m good at X and suck at Y, then what everyone is looking for is only X”.

Original title and link: DataStax’s Reaction to MySQL 5.6: Oracle’s MySQL Misses the NoSQL Mark (NoSQL database©myNoSQL)

via: http://www.datastax.com/dev/blog/oracles-mysql-misses-the-nosql-mark


How Do I Freaking Scale Oracle?

Andrew Oliver for InfoWorld:

That said, many companies I work with have spent 20 years painting themselves into an Oracle corner. While they may have one eye on a brighter future, they still must ensure their Oracle database is high-performance and highly available — and scales as well as possible. Despite what you may read in NoSQL vendor marketing materials (or even in my blog [2]), it is possible to scale Oracle.

If you can actually find the answer to the question in the title, please teach me or give me links. This article left me in the dark. Or the only light I’ve seen involves way too many additional products that most probably cost a bit.

Original title and link: How Do I Freaking Scale Oracle? (NoSQL database©myNoSQL)

via: http://www.infoworld.com/print/212392


Main Features of In-Memory Data Grids

Good article about In-Memory Data Grids on Cubrid’s blog by Ki Sun Song.

The features of IMDG can be summarized as follows:

  • Data is distributed and stored in multiple servers.
  • Each server operates in the active mode.
  • A data model is usually object-oriented (serialized) and non-relational.
  • According to the necessity, you often need to add or reduce servers.

Even if you don’t read it all, but plan to use an IMDG solution, the first two questions you want to ask your vendor are: what the approach you are proposing to deal with the limited memory capacity and what’s the strategy for reliability. You’ll get good answers from well established products, but these answers are not necessarily the ones that provide the exact requirements your solution will need.

Original title and link: Main Features of In-Memory Data Grids (NoSQL database©myNoSQL)

via: http://www.cubrid.org/blog/dev-platform/introduction-to-in-memory-data-grid-main-features/


Moving Data From Oracle to MongoDB : Bridging the Gap With JRuby

A homegrown ETL process for migrating data from Oracle to MongoDB based on JRuby chameleonic capabilities: a Ruby implementation integrating well in a Java environment:

Rather than having to re-map one database or the other in the other persistence technology to facilitate the ETL process (not DRY), JRuby allowed the two persistence technologies to interoperate. By utilizing JRuby’s powerful embedding capabilities, we were able to read data out of Oracle via Hibernate and write data to MongoDB via MongoMapper.

Original title and link: Moving Data From Oracle to MongoDB : Bridging the Gap With JRuby (NoSQL database©myNoSQL)

via: http://blog.jruby.org/2012/05/bridging-the-gap-with-jruby/


MariaDB 5.5 Connection Thread Pool

MariaDB-5.5.21-beta is the first MariaDB release featuring the new thread pool. Oracle offers a commercial thread pool plugin for MySQL Enterprise, but now MariaDB brings a thread pool implementation to the community!

I’ve checked the timestamp of the post and knowledge base article three times. It is 2012.

Original title and link: MariaDB 5.5 Connection Thread Pool (NoSQL database©myNoSQL)


Teradata and Hortonworks Partnership and What It Means

Context

Teradata sells software, hardware, and services for data warehouses and analytic applications. Part of the Teradata portfolio is also the Teradata Aster MapReduce Platform a massively parallel processing infrastructure with a software solution that embeds both SQL and MapReduce analytic processing for deeper analytic insights on multi-structured data and new analytic capabilities driven by data science.

Hortonworks offers services around the 100% Apache-licensed, open source Hortonworks Data Platform, an integrated solution built around Hadoop.

Hortonworks Data Platform

Announcement

The interesting bits from the announcement and media coverage:

News release:

Teradata and Hortonworks will join forces to provide technologies and strategic guidance to help businesses build integrated, transparent, enterprise-class big data analytic solutions that leverage Apache Hadoop. The partnership will focus on enabling businesses to use Apache Hadoop to harness the value from new sources of data. Businesses will be able to quickly load and refine multi-structured data, some of which is being discarded today, for discovery and analytics. The resulting insights will enable analysts and front line users to make the best business decision possible.

Teradata Hortonworks Hadoop Aster Architecture

For example, each day websites generate many terabytes of raw, complex data about customers’ viewing and buying habits. These web logs can be directly loaded into Teradata Aster or Apache Hadoop where they can be stored, transformed, and refined in preparation for analysis by the Teradata Aster MapReduce platform (nb: my emphasis).

Derrick Harris:

The company [Teradata] has already worked with Hortonworks’ competitor Cloudera on a connector between the Teradata Database and Cloudera’s Hadoop distribution, but the Hortonworks deal appears a little deeper and more strategic.

Quentin Hardy:

The alliance between Teradata and Hortonworks means that companies can get strategic advice about how to get into the new analytics game from Teradata, and have practical help on running the systems from Hortonworks.

Arun Murthy:

However, there are two important challenges that need to be addressed before broad enterprise adoption can occur:

  • Understanding the right use cases in which to utilize Apache Hadoop.
  • Integrating Apache Hadoop with existing data architectures in an appropriate manner to get better value from existing investments.

My sense of excitement about the Teradata/Hortonworks partnership is amplified by the fact that it addresses these two core challenges for Apache Hadoop:

  • We will be rolling out a reference architecture that provides guidance to enterprises that want to understand the best use cases for which to apply Hadoop. As part of that, we will be helping Teradata customers use Hadoop in conjunction with their Teradata and Teradata Aster analytic data solutions investments.
  • We will also be working closely with the Teradata engineering teams on jointly engineered solutions that optimize the integration points with Apache Hadoop.

Commentary

  • From Hortonworks perspective this deal is weaker than the Oracle-Cloudera deal.

    In the former case, new Teradata sales do not necessary result in new Hortonworks Data Platform installations, while in the case of the Oracle-Cloudera partnership, every sale results in a new business for Cloudera.

  • From Teradata perspective, this partnership gives them a perfect answer and solution for clients asking about unstructured data scenarios.

  • The announcement is slightly positioning Hadoop as part of ETL process, but is not as strict about this as other Hadoop integration architectures—see Netezza and Hadoop and Vertica and Hadoop.

  • Depending on the level of integration the two team will pull together, this partnership might result in one of the most complete and powerful structured and unstructured data warehouse and analytics platform.

I’m looking forward to seeing the proposed architecture blueprint once it’s finalized.

Links

Original title and link: Teradata and Hortonworks Partnership and What It Means (NoSQL database©myNoSQL)


The time for NoSQL is now

Andrew C. Oliver:

The transition to NoSQL databases will take time. We still don’t have TOAD, Crystal Reports, query language standardization and other essential tools needed for mass adoption. There will be missteps (i.e. I may need a different type of database for reporting than for my operational system), but I truly think this is one technology that isn’t just marketing.

This coming from someone that was happy to discover back in 1998 all the knobs in Oracle.

Original title and link: The time for NoSQL is now (NoSQL database©myNoSQL)

via: http://osintegrators.com/node/76


12 Hadoop Vendors to Watch in 2012

My list of 8 most interesting companies for the future of Hadoop didn’t try to include anyone having a product with the Hadoop word in it. But the list from InformationWeek does. To save you 15 clicks, here’s their list:

  • Amazon Elastic MapReduce
  • Cloudera
  • Datameer
  • EMC (with EMC Greenplum Unified Analytics Platform and EMC Data Computing Appliance)
  • Hadapt
  • Hortonworks
  • IBM (InfoSphere BigInsights)
  • Informatica (for HParser)
  • Karmasphere
  • MapR
  • Microsoft
  • Oracle

Original title and link: 12 Hadoop Vendors to Watch in 2012 (NoSQL database©myNoSQL)