DataStax: All content tagged as DataStax in NoSQL databases and polyglot persistence
Hey, it looks like the NoSQL applications panel I moderated at QCon SF 2011 went live minutes ago on InfoQ. It features Andy Gross (Basho), Frank Weigel (Couchbase), Matt Pfeil (DataStax), Michael Stack (StumbleUpon), Jared Rosoff (10gen), and yours truly.
It misses my opening jokes though.
Original title and link: NoSQL Applications Panel Video ( ©myNoSQL)
Filtering and augmenting a Q&A on Quora:
- Cloudera: Hadoop distribution, Cloudera Enterprise, Services, Training
- Hortonworks: Apache Hadoop major contributions, Services, Training
- MapR: Hadoop distribution, Services, Training
- HPCC Systems: massive parallel-processing computing platform
- HStreaming: real-time data processing and analytics capabilities on top of Hadoop
- DataStax: DataStax Enterprise, an Apache Cassandra-based platform that accepts real-time input from online applications while offering analytic operations powered by Hadoop
- Zettaset: Enterprise Data Analytics Suite built on Hadoop
- Hadapt: analytic platform based on Apache Hadoop and relational DBMS technology
I’ve left aside names like IBM, EMC, and Informatica, which are doing a lot of integration work.
Original title and link: 8 Most Interesting Companies for Hadoop’s Future ( ©myNoSQL)
- DataStax Brisk: Hadoop and Hive on Cassandra
- NetApp Hadoop Shared DAS
- Mellanox: increase throughput in Hadoop clusters via its ConnectX-2 adapters with Hadoop Direct
- SnapLogic: SnapReduce transforms SnapLogic data integration pipelines directly into MapReduce tasks, making Hadoop processing much more accessible and resulting in optimal Hadoop cluster utilization.
- EMC: Greenplum HD combines the Hadoop analytics platform with Greenplum’s database technology.
Ways to look at it:
- 2 large corporations getting into Hadoop
- 2 software solutions, 3 hardware solutions
- 1 open source project, 4 commercial products or
- 4 companies wanting to make a profit from Hadoop without contributing back to the community
Original title and link: Hadoop Ecosystem: EMC, NetApp, Mellanox, SnapLogic, DataStax (NoSQL databases © myNoSQL)
According to the official documentation, Brisk’s key advantages are:
- no single point of failure
- streamlined setup and operations
- analytics without ETL
- full integration with DataStax OpsCenter
Now that’s a title: The Brangelina of Big Data: Cassandra mates with Hadoop. Open source celebrity supercouple. The article is a genealogy tree: Hadoop, Hive, Cassandra, DataStax.
I just heard the announcement that DataStax, the company offering Cassandra services, made about Brisk, a Hadoop and Hive distribution built on top of Cassandra:
Brisk provides integrated Hadoop MapReduce, Hive and job and task tracking capabilities, while providing an HDFS-compatible storage layer powered by Cassandra.
Brisk was announced officially during the MapReduce panel at the Structure Big Data event. But it looks like others have already had a chance to hear about Brisk. Is there something that I should be doing to hear the “unofficial” announcements?
DataStax has also made a whitepaper available for download: “Evolving Hadoop into a Low-Latency Data Infrastructure: Unifying Hadoop, Hive and Apache Cassandra for Real-time and Analytics”.
DataStax OpsCenter for Apache Cassandra
DataStax (ex-Riptano) announced yesterday OpsCenter, their tool for managing, monitoring, and operating enterprise Cassandra applications. It includes sophisticated visualizations of the cluster and comprehensive management and configuration.
DataStax OpsCenter for Apache Cassandra will require a subscription, but a developer version, not to be used in production, will be made available too.
Call me an idealist, but I would have suggested a different model than Gold/Silver/Bronze or Mission-Critical/Premier:
- 1-5 nodes: free (nb: good karma)
- 6 to low tens of nodes: moderately priced package
- premier: everything else
EMC Greenplum Community Edition
After acquiring Greenplum, EMC is making available a community edition:
[…] the new EMC Greenplum Community Edition removes the cost barrier to entry for big data power tools empowering large numbers of developers, data scientists, and other data professionals. This free set of tools enables the community to not only better understand their data, gain deeper insights and better visualize insights, but to also contribute and participate in the development of next-generation tools and solutions. With the Community Edition stack, developers can build complex applications to collect, analyze and operationalize big data leveraging best of breed big data tools including the Greenplum Database with its in-database analytic processing capabilities.
I couldn’t find the details of the community edition license, but instead I’ve found this:
The software is only intended for research, development and experiments, with license purchases required for commercial uses.
You can read more about the (marketing) rationale behind this release on the blog of Chuck Hollis, Global Marketing CTO.