RainStor: All content tagged as RainStor in NoSQL databases and polyglot persistence
I don’t hear much about RainStor, the company that announced, at the beginning of the year, a solution promising impressive data compression rates, but it looks like it is signing some interesting partnerships. First it was HP, which will include RainStor Database technology in HP Investigation Solution, and now it’s Dell, which will include RainStor in its Big Data Retention solution.
A couple of words about RainStor:
- a massively parallel processing system
- shared everything architecture
- can be deployed on a standard RAID or HDFS
Original title and link: Dell Big Data Retention Solution Includes RainStor (©myNoSQL)
RainStor has announced Big Data Analytics on Hadoop, promising:
- The highest data compression in the industry with up to 40x reduction, compared to raw data typically stored in HDFS, with no re-inflation required for access
- The ability to run faster query and analysis using both SQL query and MapReduce with 10-100x faster results
- The ability to perform analytics directly in Hadoop, reducing the need to create copies and transfer data out
- Reduced nodes in a Hadoop cluster with ~85 percent lower operating costs.
A couple of comments:
- RainStor is not the only solution that can perform analytics directly in Hadoop
- Hive provides a mechanism to project structure onto data stored in Hadoop and to query it using a SQL-like language called HiveQL
- RainStor’s MapReduce support is via Pig
According to this, there’s an interesting aspect of RainStor’s support for SQL and MapReduce:
Users can choose SQL for rapid response ad-hoc queries or run batch jobs using MapReduce against RainStor data. Additionally you can interoperate SQL and MapReduce and join results from a query against RainStor and against native CSV files on HDFS.
As a side note, Toad for Cloud from Quest is a tool that tries to provide a table-based perspective of data in relational and NoSQL databases.
Anyway, the most interesting part of the announcement is RainStor’s claimed data compression level (up to 40x) and the fact that accessing data doesn’t require re-inflation. According to an infographic, the currently available compression solutions top out at 8x:
- Hadoop LZO: 3x
- Compressed relational: 6x
- Flatfile Gzip: 7x
- Columnar: 8x
If such compression levels can be achieved frequently and the impact on other server resources (CPU, memory) is minimal, RainStor Big Data Analytics on Hadoop will definitely be an interesting part of the Hadoop market.
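To put those ratios in perspective, here is a minimal Python sketch of the arithmetic behind them. The ratios are the ones quoted above; the 100 TB raw data volume and the helper names `stored_size_tb` and `savings_percent` are assumptions made up for illustration, not figures or APIs from RainStor.

```python
# Compression ratios as quoted in the post (from RainStor's infographic).
# The RainStor figure is the vendor's claimed upper bound, not a measurement.
RATIOS = {
    "Hadoop LZO": 3,
    "Compressed relational": 6,
    "Flatfile Gzip": 7,
    "Columnar": 8,
    "RainStor (claimed, up to)": 40,
}


def stored_size_tb(raw_tb, ratio):
    """Size on disk, in TB, after compressing raw_tb terabytes at ratio:1."""
    return raw_tb / ratio


def savings_percent(ratio):
    """Share of the raw volume that no longer needs to be stored."""
    return (1 - 1 / ratio) * 100


if __name__ == "__main__":
    RAW_TB = 100  # hypothetical raw data volume, chosen only for illustration
    for name, ratio in RATIOS.items():
        size = stored_size_tb(RAW_TB, ratio)
        saved = savings_percent(ratio)
        print(f"{name:<28}{size:6.1f} TB  ({saved:.1f}% saved)")
```

Under that assumed 100 TB of raw data, an 8x columnar store would still need 12.5 TB, while a 40x ratio would need 2.5 TB, a fivefold difference in storage alone. This says nothing about the CPU and memory cost of reaching such ratios.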
Before leaving you with the infographic, here is a nice quote from RainStor’s CEO, John Bantleman:
We see Hadoop as a platform like Linux, which needs solutions on top to deliver value.