column store: All content tagged as column store in NoSQL databases and polyglot persistence
Hortonworks has announced the 1.0 release of the Hortonworks Data Platform prior to the Hadoop Summit 2012 together with a lot of supporting quotes from companies like Attunity, Dataguise, Datameer, Karmasphere, Kognitio, MarkLogic, Microsoft, NetApp, StackIQ, Syncsort, Talend, 10gen, Teradata, and VMware.
Some info points:
Hortonworks Data Platform is a platform meant to simplify the installation, integration, management, and use of Apache Hadoop
- HDP 1.0 is based on Apache Hadoop 1.0
- Apache Ambari is used for installation and provisioning
- The same Apache Amabari is behind the Hortonworks Management Console
- For Data integration, HDP offers WebHDFS, HCatalog APIs, and Talend Open Studio
- Apache HCatalog is the solution offering metadata and table management
Hortonworks Data Platform is 100% open source—I really appreciate Hortonworks’s dedication to the Apache Hadoop project and open source community
- HDP comes with 3 levels of support subscriptions, pricing starting at $12500/year for a 10 nodes cluster
One of the most interesting aspects of the Hortonworks Data Platform release is that the high-availability (HA) option for HDP is based on using VMWare-powered virtual machines for the NameNode and JobTracker. My first thought about this approach is that it was chosen to strengthen a partnership with VMWare. On the other hand, Hadoop 2.0 contains already a new highly-available version of the NameNode (Cloudera Hadoop Distribution uses this solution) and VMWare has bigger plans for a virtualization-friendly Hadoop environment with project Serengeti.
Original title and link: Hortonworks Data Platform 1.0 ( ©myNoSQL)
Two posts by Oliver Meyn on measuring the performance of two HBase clusters—first results on the original cluster and results on the upgraded cluster— using
org.apache.hadoop.hbase.PerformanceEvaluation, the resulting performance charts, Ganglia charts, and some thoughts and feedback from the HBase community.
Original title and link: Performance Evaluation of HBase and How Hardware Changes Results ( ©myNoSQL)
- Read caching improvements
- Seek optimizations
- WAL writes optimizations
- added functionality to HBck: fixing orphaned regions, region holes, overlapping regions
- simplified region sizing
- atomic Put & Delete in a single transaction
Original title and link: HBase 0.94 Released: What’s New ( ©myNoSQL)
EngineYard’s Ines Sombra recorded a conversation with Mathias Meyer about NoSQL databases and their evolution towards more friendlier functionality, relational databases and their steps towards non-relational models, and a bit more on what polyglot persistence means.
Mathias Meyer is one of the people I could talk for days about NoSQL and databases in general with different infrastructure toppings and he has some of the most well balanced thoughts when speaking about this exciting space—see this conversation I’ve had with him in the early days of NoSQL. I strongly encourage you to download the mp3 and listen to it.
Original title and link: NoSQL and Relational Databases Podcast With Mathias Meyer ( ©myNoSQL)
There are a lot of interesting new features and improvements in the newly released Cassandra 1.1 version to cover them all here, but here’s the gist of them:
- Schema improvements
- Support for compound keys
- Concurrent schema changes
- A new version of Cassandra Query Language (CQL3) supporting compound keys and wide rows
- Better and easier tuning of the key and row caches
- Support for per-table hybrid storage —mixing SSDs and spinning disks
This DataStax’s blog entry provides links to more details about all these features and the others I haven’t enumerated above.
Original title and link: Cassandra 1.1 Released: What’s New ( ©myNoSQL)