teradata: All content tagged as teradata in NoSQL databases and polyglot persistence

Explaining Hadoop to Your CEO

Dan Woods (Forbes):

The answer is, yes, Hadoop could be helpful, but there are other technologies as well. For example, technologies such as Splunk allow you to explore big data sets in a way that’s more interactive than most Hadoop implementations. Splunk not only lets you play with big data; you can also distill it and visualize it. Pervasive’s DataRush allows you to write parallel programs using a simplified programming model, and then process lots of data at scale. 1010data allows you to look at a spreadsheet that has a trillion rows, as well as handle time series data. EMC Greenplum and Teradata Aster Data and SAP HANA will also want a crack at your business. If you take any of these technologies and combine them with QlikView, Tableau, or TIBCO Spotfire, you can figure out what a big data set means to your business very quickly. So if your job is understanding the business value of the data, Hadoop is one of many things that you should analyze.

Translation:

Blah blah blah Big Data, blah blah blah list of vendors, blah blah blah Big Data

It might even work for a dummy CEO.

Original title and link: Explaining Hadoop to Your CEO (NoSQL database©myNoSQL)

via: http://www.forbes.com/sites/danwoods/2011/11/03/explaining-hadoop-to-your-ceo/


Hadoop: It's Still a Niche Technology

In an otherwise generic but interesting post about Hadoop and its integration with data analytics and data warehouse solutions, Jessica Twentyman writes:

It’s still a niche technology, but Hadoop’s profile received a serious boost over the past year, thanks in part to start-up companies such as Cloudera and MapR that offer commercially licensed and supported distributions of Hadoop. Its growing popularity is also the result of serious interest shown by EDW vendors like EMC, IBM and Teradata. EMC bought Hadoop specialist Greenplum in June 2010; Teradata announced its acquisition of Aster Data in March 2011; and IBM announced its own Hadoop offering, Infosphere, in May 2011.

Unfortunately, she got this all wrong. It is the open source community, developers, data scientists, and Cloudera that helped popularize Hadoop.

These data analytics and data warehouse vendors are just capitalizing on Hadoop delivering results. They haven’t been knocking on doors asking, “Have you heard of Hadoop? Do you want to try it?” They ran into Hadoop in most of the places they went, and that made them realize it is a business opportunity.

So, I’ll say it again: Hadoop is popular thanks to the open source community, developers, data scientists and Cloudera.

via: http://searchdatamanagement.techtarget.co.uk/feature/Hadoop-for-big-data-puts-architects-on-journey-of-discovery


Big Data Is Going Mainstream: Facebook, Yahoo!, eBay, Quantcast, and Many Others

Shawn Rogers has a short but compelling list of Big Data deployments in his article Big Data is Scaling BI and Analytics. This list also shows that even if there are some common components like Hadoop, there are no blueprints yet for dealing with Big Data.

  • Facebook: Hadoop analytic data warehouse, using HDFS to store more than 30 petabytes of data. Their Big Data stack is based only on open source solutions.

  • Quantcast: 3,000-core, 3,500-terabyte Hadoop deployment that processes more than a petabyte of raw data each day

  • University of Nebraska-Lincoln: Hadoop cluster holding 1.6 petabytes of physics data

  • Yahoo!: 100,000 CPUs in 40,000 computers, all running Hadoop. Also running a 12-terabyte MOLAP cube based on Tableau Software

  • eBay: 3 separate analytics environments:

    • 6PB data warehouse for structured data and SQL access
    • 40PB deep analytics (Teradata)
    • 20PB Hadoop system to support advanced analytic workload on unstructured data


Aster Data SQL-MapReduce Technology Patent

From a Teradata PR announcement:

SQL-MapReduce® is a framework which enables fast, investigative analysis of complex information by data scientists and business analysts. It enables procedural expressions in software languages (such as Java, C#, Python, C++, and R) to be parallelized across a group of linked computers (compute cluster) and then activated for use (invoked) with standard SQL.  

The closest open source solution I can think of is Pig, which Yahoo! created and open sourced.
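To make the idea in the announcement a bit more concrete, here is a minimal Python sketch of the general pattern it describes: a procedural function is applied to partitions of a table in parallel and the partial results are merged. This is not Aster Data's implementation or API. The names `partition_by`, `count_clicks`, and `run_partitioned` are hypothetical, the sample rows are made up, and a local thread pool merely stands in for a compute cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by(rows, key, n_workers):
    """Split rows into n_workers buckets so rows with equal keys share a bucket."""
    buckets = [[] for _ in range(n_workers)]
    for row in rows:
        buckets[hash(row[key]) % n_workers].append(row)
    return buckets


def count_clicks(rows):
    """Hypothetical procedural function: count rows per user in one partition."""
    counts = defaultdict(int)
    for row in rows:
        counts[row["user"]] += 1
    return dict(counts)


def run_partitioned(rows, key, func, n_workers=3):
    """Apply func to each partition in parallel, then merge the partial results."""
    buckets = partition_by(rows, key, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(func, buckets))
    merged = {}
    for partial in partials:
        # Safe to merge with update(): each key lives in exactly one bucket.
        merged.update(partial)
    return merged


# Made-up sample data standing in for a table of click events.
clicks = [
    {"user": "ann", "page": "/a"},
    {"user": "bob", "page": "/b"},
    {"user": "ann", "page": "/c"},
    {"user": "cy", "page": "/a"},
]
result = run_partitioned(clicks, "user", count_clicks)
print(result)
```

In SQL terms this is roughly what a `GROUP BY user` with a count would return. The point of a SQL-MapReduce-style framework, as the quote describes it, is that the per-partition function can be arbitrary procedural code rather than a built-in aggregate.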


2 Ways to Tackle Really Big Data

So there you have the two approaches to handling machine-generated data. If you have vast archives, EMC, IBM Netezza, and Teradata all have purpose-built appliances that scale into the petabytes. You also could use Hadoop, which promises much lower cost, but you’ll have to develop separate processes and applications for that environment. You’ll also have to establish or outsource expertise on Hadoop deployment, management, and data processing. For fast-query needs, EMC, IBM Netezza, and Teradata all have fast, standard appliances and faster, high-performance appliances (and companies including Kognitio and Oracle have similar configuration choices). Column-oriented database and appliance vendors including HP Vertica, InfoBright, ParAccel, and Sybase have speed advantages inherent in their database architectures.

I’m wondering why Hadoop is mentioned just in passing considering how many large datasets it is already handling.

via: http://www.informationweek.com/news/software/info_management/231000314?cid=RSSfeed_IWK_Business_Intelligence


The Data Processing Platform for Tomorrow

In the blue corner we have IBM with Netezza as analytic database, Cognos for BI, and SPSS for predictive analytics. In the green corner we have EMC with Greenplum and the partnership with SAS[1]. And in the open source corner we have Hadoop and R.

Update: there’s also another corner I don’t know how to color where Teradata and its recently acquired Aster Data partner with SAS.

Who is ready to bet on which of these platforms will be processing more data in the coming years?


  1. GigaOm has a good article on this subject.


Cloudera: A Business Intelligence Leader

The Informatica accord is Cloudera’s second partnership this year with a leading DI player. Back in August, Cloudera cemented a deal with open source software (OSS) data integration (DI) specialist Talend. It also has partnerships with Teradata Corp., the former Netezza Inc., the former Greenplum Software Corp., Aster Data Systems Inc., Vertica Inc., and Pentaho.

One thing’s for sure: Cloudera is certainly attracting attention.

The strategy is surprisingly simple: make it easy to put data in and get it out.

via: http://tdwi.org/articles/2011/02/16/cloudera-leader-bi-hadoop.aspx


Hadoop Spreading through Cloudera Partnerships

Cloudera, in its attempt to Hadoopize the world, goes on a partnership spree:

Many of you may have read about some of the recent announcements of partnerships between Cloudera and some of the leading data management software companies like Teradata, Netezza, Greenplum (EMC), Quest and Aster Data. We established these partnerships because Hadoop is increasingly serving as an open platform that many different applications and complementary technologies work with. Our goal is to make this as easy and as standardized as possible.

Checking the ☞ press release section turns up the following partnerships:

  • Membase
  • Talend
  • Quest
  • Pentaho
  • NTT Data
  • Aster Data
  • EMC Greenplum
  • Teradata
  • Netezza

Quite a few companies from the non-relational market.

via: http://www.cloudera.com/blog/2010/10/cdh3-beta-3-now-available/


Teradata, Cloudera team up on Hadoop data warehousing

In other words, Hadoop and data warehousing isn’t a zero sum game. The two technologies will co-exist. Teradata will bundle a connector (the Teradata Hadoop Connector) to its systems with Cloudera Enterprise at no additional cost. Cloudera will provide support for the connector as part of its enterprise subscription. The two parties will also jointly market the connector.

That’s why we are saying NoSQL is just another tool in our toolbox.

via: http://www.zdnet.com/blog/btl/teradata-cloudera-team-up-on-hadoop-data-warehousing/39198


MySpace and Big Data

As you can imagine MySpace has to deal with huge amounts of data too, but they are doing it differently:

In its search for new data warehousing technology, the company evaluated technology from vendors including Teradata and Netezza. However, says Watters, “we didn’t think any of it could scale according to our needs.”

Instead it turned to Aster Data, whose “massively parallel” database technology is based on Google’s MapReduce distributed analytics engine.[1]


  1. According to ☞ this: “Aster’s patent-pending analytics framework called SQL-MapReduce is unique to the Aster Data platform.”

via: http://www.information-age.com/channels/information-management/it-case-studies/1267988/myspace-taps-big-data-for-turnaround.thtml