DataStax: All content tagged as DataStax in NoSQL databases and polyglot persistence
Wednesday, 11 May 2011
Hadoop Ecosystem: EMC, NetApp, Mellanox, SnapLogic, DataStax
GigaOm and RWW have coverage of the 5 Hadoop-related announcements:
- DataStax Brisk: Hadoop and Hive on Cassandra
- NetApp Hadoop Shared DAS
- Mellanox Hadoop-Direct: increases throughput in Hadoop clusters via its ConnectX-2 adapters with Hadoop Direct
- SnapLogic SnapReduce: SnapReduce transforms SnapLogic data integration pipelines directly into MapReduce tasks, making Hadoop processing much more accessible and resulting in optimal Hadoop cluster utilization.
- EMC Greenplum HD: Greenplum HD combines the Hadoop analytics platform with Greenplum’s database technology.
Ways to look at it:
- 2 large corporations getting into Hadoop
- 2 software solutions, 3 hardware solutions
- 1 open source project, 4 commercial products or
- 4 companies wanting to make a profit from Hadoop without contributing back to the community
Original title and link: Hadoop Ecosystem: EMC, NetApp, Mellanox, SnapLogic, DataStax (NoSQL databases © myNoSQL)
Monday, 9 May 2011
DataStax Hadoop on Cassandra Brisk Released
DataStax kept its promise and released Brisk: the Hadoop and Hive distribution using Cassandra, also known as Brangelina.
According to the official documentation, Brisk's key advantages are:
- no single point of failure
- streamlined setup and operations
- analytics without ETL
- full integration with DataStax OpsCenter

Monday, 18 April 2011
Amazon EC2 Cassandra Cluster with DataStax AMI
This AMI does the following:
- installs Cassandra 0.7.4 on an Ubuntu 10.10 image
- configures ephemeral disks in raid0, if applicable (EBS is a bad fit for Cassandra)
- configures Cassandra to use the root volume for the commitlog and the ephemeral disks for data files
- configures Cassandra to use the local interface for intra-cluster communication
- configures all Cassandra nodes with the same seed for gossip discovery
Note the “EBS is a bad fit for Cassandra” remark. Adrian Cockcroft explains why in Multi-tenancy and Cloud Storage Performance.
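The layout the AMI sets up can be sketched as a small Python helper that builds per-node settings. This is only an illustration, not the AMI's actual script: the function names, the `/raid0` mount point, and the directory paths are hypothetical, the key names follow what I recall of Cassandra 0.7's `cassandra.yaml` and should be treated as assumptions, and picking the first node as the shared seed is my own choice (the post only says all nodes get the same seed).

```python
def cassandra_node_settings(private_ip, seed_ip,
                            ephemeral_mount="/raid0",
                            root_dir="/var/lib/cassandra"):
    """Build cassandra.yaml-style settings for one node (illustrative only)."""
    return {
        # local (private) interface for intra-cluster communication
        "listen_address": private_ip,
        # every node points at the same seed for gossip discovery
        "seeds": [seed_ip],
        # commitlog stays on the root volume
        "commitlog_directory": f"{root_dir}/commitlog",
        # data files go on the raid0 array of ephemeral disks
        "data_file_directories": [f"{ephemeral_mount}/cassandra/data"],
    }


def cluster_settings(private_ips):
    """Settings for all nodes, using the first address as the shared seed."""
    seed = private_ips[0]
    return [cassandra_node_settings(ip, seed) for ip in private_ips]
```

Calling `cluster_settings(["10.0.0.1", "10.0.0.2", "10.0.0.3"])` gives three dictionaries that differ only in `listen_address`, which mirrors the "same seed, local interface" setup described in the list.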
via: http://www.datastax.com/dev/blog/setting-up-a-cassandra-cluster-with-the-datastax-ami
Wednesday, 23 March 2011
Brisk: The Brangelina of Big Data
Now that’s a title: The Brangelina of Big Data: Cassandra mates with Hadoop. Open source celebrity supercouple. The article is a genealogy tree: Hadoop, Hive, Cassandra, DataStax.
Cassandra + Hadoop = Brisk by DataStax
I just heard the announcement that DataStax, the company offering Cassandra services, made about Brisk, a Hadoop and Hive distribution built on top of Cassandra:
Brisk provides integrated Hadoop MapReduce, Hive and job and task tracking capabilities, while providing an HDFS-compatible storage layer powered by Cassandra.
Brisk was announced officially during the MapReduce panel at the Structure Big Data event. But it looks like others have already had a chance to hear about Brisk. Is there something that I should be doing to hear the “unofficial” announcements?
DataStax has also made a whitepaper available for download: “Evolving Hadoop into a Low-Latency Data Infrastructure: Unifying Hadoop, Hive and Apache Cassandra for Real-time and Analytics”.
Tuesday, 1 February 2011
New Tools in the NoSQL and Big Data Market
DataStax OpsCenter for Apache Cassandra
DataStax (ex-Riptano) yesterday announced OpsCenter, their tool for managing, monitoring, and operating enterprise Cassandra applications, including sophisticated visualizations of the cluster and comprehensive management and configuration.
DataStax OpsCenter for Apache Cassandra will require a subscription, but a developer version, not to be used in production, will be made available too.
Call me an idealist, but I would have suggested a different model than Gold/Silver/Bronze or Mission-Critical/Premier:
- 1-5 nodes: free (nb: good karma)
- 6 to low tens of nodes: moderately priced package
- premier: everything else
EMC Greenplum Community Edition
After acquiring Greenplum, EMC is making available a community edition:
[…] the new EMC Greenplum Community Edition removes the cost barrier to entry for big data power tools empowering large numbers of developers, data scientists, and other data professionals. This free set of tools enables the community to not only better understand their data, gain deeper insights and better visualize insights, but to also contribute and participate in the development of next-generation tools and solutions. With the Community Edition stack, developers can build complex applications to collect, analyze and operationalize big data leveraging best of breed big data tools including the Greenplum Database with its in-database analytic processing capabilities.
I couldn’t find the details of the community edition license, but instead I’ve found this:
The software is only intended for research, development and experiments, with license purchases required for commercial uses.
You can read more about the (marketing) rationale behind this release on the blog of Chuck Hollis, Global Marketing CTO.
Monday, 17 January 2011
Riptano Becomes DataStax
But they lost the rhino logo.
I broke the news of Riptano's creation in April. And this is the second renaming in the industry.
