Brisk: All content tagged as Brisk in NoSQL databases and polyglot persistence
Martin Schneider (Basho) tries to answer the question in the title:
Riak can be a data store to a purpose-built enterprise app; a caching layer for an Internet app, or part of the distributed fabric and DNA of a Global app. Those are of course highly arbitrary and vague examples, but it shows how flexible Riak is as a platform.
“Can be” is not quite equivalent to being the right solution, and even less so to being the best solution. And Martin’s answer to this is:
For super scalable enterprise and global apps — those where the data inside is inherently valuable and dependability of the system to capture, process and store data/writes is imperative — well I see Riak outperforming any perceived competitor in the space in providing value here.
But even for these scenarios, there’s competition from solutions like Cassandra, HBase, and Hypertable — the whole spectrum of scalable storage solutions based on Google BigTable and Amazon Dynamo being covered: HBase (a BigTable implementation), Cassandra (a solution using the BigTable data model and the Dynamo distributed model), and Riak (a solution based mainly on the Amazon Dynamo paper).
While Riak presents itself as the cleanest Dynamo-based solution, I would venture to say that both Cassandra and HBase come to the table with some interesting characteristics that cannot be ignored:
- Strong communities and community-driven development processes: both HBase and Cassandra are top-level Apache Software Foundation projects
- Excellent integration with Hadoop, the leading batch processing solution. DataStax, the company offering services for Cassandra, went the extra mile by creating a custom Hadoop solution, Brisk, making this integration even better.
Bottom line, I don’t think we can declare a winner in this space and I believe all three solutions will stay around for a while competing for every scenario requiring dependability of the system to capture, process and store data.
- DataStax Brisk: Hadoop and Hive on Cassandra
- NetApp Hadoop Shared DAS
- Mellanox: increases throughput in Hadoop clusters via its ConnectX-2 adapters with Hadoop Direct
- SnapLogic SnapReduce: transforms SnapLogic data integration pipelines directly into MapReduce tasks, making Hadoop processing much more accessible and resulting in optimal Hadoop cluster utilization
- EMC Greenplum HD: combines the Hadoop analytics platform with Greenplum’s database technology
Ways to look at it:
- 2 large corporations getting into Hadoop
- 2 software solutions, 3 hardware solutions
- 1 open source project, 4 commercial products or
- 4 companies wanting to make a profit from Hadoop without contributing back to the community
Original title and link: Hadoop Ecosystem: EMC, NetApp, Mellanox, SnapLogic, DataStax (NoSQL databases © myNoSQL)
According to the official documentation, Brisk’s key advantages are:
- no single point of failure
- streamlined setup and operations
- analytics without ETL
- full integration with DataStax OpsCenter
Now that’s a title: “The Brangelina of Big Data: Cassandra mates with Hadoop”. Open source celebrity supercouple. The article reads as a family tree: Hadoop, Hive, Cassandra, DataStax.
I just heard the announcement that DataStax, the company offering Cassandra services, made about Brisk, a Hadoop and Hive distribution built on top of Cassandra:
Brisk provides integrated Hadoop MapReduce, Hive and job and task tracking capabilities, while providing an HDFS-compatible storage layer powered by Cassandra.
Brisk was announced officially during the MapReduce panel at the Structure Big Data event. But it looks like others have already had a chance to hear about Brisk. Is there something that I should be doing to hear the “unofficial” announcements?
DataStax has also made a whitepaper available for download: “Evolving Hadoop into a Low-Latency Data Infrastructure: Unifying Hadoop, Hive and Apache Cassandra for Real-time and Analytics”.