


NoSQL releases: All content tagged as NoSQL releases in NoSQL databases and polyglot persistence

Hortonworks Data Platform 1.0

Hortonworks has announced the 1.0 release of the Hortonworks Data Platform prior to the Hadoop Summit 2012 together with a lot of supporting quotes from companies like Attunity, Dataguise, Datameer, Karmasphere, Kognitio, MarkLogic, Microsoft, NetApp, StackIQ, Syncsort, Talend, 10gen, Teradata, and VMware.

Some info points:

  1. Hortonworks Data Platform is a platform meant to simplify the installation, integration, management, and use of Apache Hadoop


    1. HDP 1.0 is based on Apache Hadoop 1.0
    2. Apache Ambari is used for installation and provisioning
    3. The same Apache Ambari is behind the Hortonworks Management Console
    4. For Data integration, HDP offers WebHDFS, HCatalog APIs, and Talend Open Studio
    5. Apache HCatalog is the solution offering metadata and table management
  2. Hortonworks Data Platform is 100% open source—I really appreciate Hortonworks’s dedication to the Apache Hadoop project and open source community

  3. HDP comes with 3 levels of support subscriptions, with pricing starting at $12,500/year for a 10-node cluster

One of the most interesting aspects of the Hortonworks Data Platform release is that the high-availability (HA) option for HDP is based on using VMware-powered virtual machines for the NameNode and JobTracker. My first thought was that this approach was chosen to strengthen a partnership with VMware. On the other hand, Hadoop 2.0 already contains a new highly available version of the NameNode (the Cloudera Hadoop distribution uses this solution), and VMware has bigger plans for a virtualization-friendly Hadoop environment with project Serengeti.

You can read a lot of posts about this announcement, but you’ll find all the details in Hortonworks’s John Kreisa’s post here and the PR announcement.

Original title and link: Hortonworks Data Platform 1.0 (NoSQL database©myNoSQL)

HBase 0.94 Released: What’s New

With over 350 enhancements and bug fixes, 0.94 is the new major release of HBase. This Cloudera blog post gives a good summary of the most interesting improvements:

  • Read caching improvements
  • Seek optimizations
  • WAL write optimizations
  • new functionality in HBck: fixing orphaned regions, region holes, and overlapping regions
  • simplified region sizing
  • atomic Put & Delete in a single transaction


Cassandra 1.1 Released: What’s New

The newly released Cassandra 1.1 has too many interesting new features and improvements to cover them all here, but here’s the gist of them:

  1. Schema improvements
    1. Support for compound keys
    2. Concurrent schema changes
  2. A new version of Cassandra Query Language (CQL3) supporting compound keys and wide rows
  3. Better and easier tuning of the key and row caches
  4. Support for per-table hybrid storage —mixing SSDs and spinning disks
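To make the compound key and wide row items above a little more concrete, here is a minimal Python sketch of the idea. It is a toy model, not Cassandra code: it assumes the first key component selects a storage row and the second component orders the entries inside that row. The timeline table, its columns, and the insert/select helpers are all hypothetical.

```python
from bisect import insort
from collections import defaultdict

# Toy model of a table with a compound key (user_id, posted_at).
# user_id picks the wide row; posted_at orders the entries inside it.
# All names here are hypothetical and only illustrate the concept.
timeline = defaultdict(list)  # user_id -> sorted list of (posted_at, body)


def insert(user_id, posted_at, body):
    """Add one logical row; entries stay sorted by the second key component."""
    insort(timeline[user_id], (posted_at, body))


def select(user_id, start=None, end=None):
    """Return the logical rows of one wide row, optionally sliced by posted_at."""
    return [
        (ts, body)
        for ts, body in timeline[user_id]
        if (start is None or ts >= start) and (end is None or ts <= end)
    ]


insert("alice", 3, "third")
insert("alice", 1, "first")
insert("bob", 2, "hello")
insert("alice", 2, "second")
```

Calling select("alice", start=2) walks a single wide row and returns its entries in key order, which is the access pattern compound keys are meant to make cheap.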

This DataStax blog entry provides links to more details about all these features and the others I haven’t enumerated above.


NoSQL Releases and Announcements

Catching up after almost two weeks offline is no easy task, but I hope I won’t miss any important events, releases, or posts. If I do, please email me.

Cassandra 1.0.9: Maintenance Release

The complete change notes for Cassandra 1.0.9 are here:

  • improve index sampling performance (CASSANDRA-4023)
  • always compact away deleted hints immediately after handoff (CASSANDRA-3955)
  • delete hints from dropped ColumnFamilies on handoff instead of erroring out (CASSANDRA-3975)
  • add CompositeType ref to the CLI doc for create/update column family (CASSANDRA-3980)
  • Avoid NPE during repair when a keyspace has no CFs (CASSANDRA-3988)
  • Fix division-by-zero error on get_slice (CASSANDRA-4000)
  • don’t change manifest level for cleanup, scrub, and upgradesstables operations under LeveledCompactionStrategy (CASSANDRA-3989, 4112)
  • fix race leading to super columns assertion failure (CASSANDRA-3957)
  • ensure that directory is selected for compaction for user-defined tasks and upgradesstables (CASSANDRA-3985)
  • allow custom types in CLI’s assume command (CASSANDRA-4081)
  • fix totalBytes count for parallel compactions (CASSANDRA-3758)
  • fix intermittent NPE in get_slice (CASSANDRA-4095)
  • remove unnecessary asserts in native code interfaces (CASSANDRA-4096)
  • Fix EC2 snitch incorrectly reporting region (CASSANDRA-4026)
  • Shut down thrift during decommission (CASSANDRA-4086)
  • Merged from 0.8: Fix ConcurrentModificationException in gossiper (CASSANDRA-4019)

  • Pig

    • support Counter ColumnFamilies (CASSANDRA-3973)
    • Composite column support (CASSANDRA-3684)
  • CQL

    • fix NPE on invalid CQL delete command (CASSANDRA-3755)
    • Validate blank keys in CQL to avoid assertion errors (CASSANDRA-3612)

Apache Hadoop User Impersonation vulnerability

This vulnerability, discovered by Cloudera’s Aaron T. Myers, affects several Hadoop versions, including 1.0.0 to 1.0.1 and 0.23.0 to 0.23.1, when Kerberos is enabled. Complete details are available here.

CouchDB 1.2.0

This is the first important release after the start-of-the-year CouchDB hubbub with Damien Katz and Couchbase. The new version is a major release that deserves its own post: CouchDB 1.2.0: Performance, Security, API, Core and Replication Improvements.

Riak 1.1.2: Stabilization release

Just a maintenance release in the Riak 1.1 series. Complete release notes here.


CouchDB 1.2.0: Performance, Security, API, Core and Replication Improvements

CouchDB 1.2.0 was released on April 6th. The linked post provides all the details of the new version, but here are some important improvements included with the new release:

  • Performance: added a native JSON parser
  • Performance: optional file compression for database and view index files
  • Performance: a new replicator implementation. More reliable, faster, configurable.
  • Security: the _users database and information in the _replication databases are no longer readable by everyone
  • Core: added support for automatic compaction. Automatic compaction is off by default, but can be enabled through Futon or the .ini file and configured to run based on multiple variables:
    • A threshold for the fragmentation ratio, i.e. the share of file_size not taken up by data_size (say 70%)
    • A time window specified in hours and minutes (e.g. 01:00-05:00)
    • Compaction can be cancelled if it exceeds the closing time.
    • Compaction for views and databases can be set to run in parallel
    • If there’s not enough space (2 × data_size) on the disk to complete a compaction, an error is logged and the compaction is not started.
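The rules above can be sketched as a small decision function. This is only an illustration in Python, not CouchDB’s actual implementation: the function names and default values are hypothetical, and it assumes the threshold is applied to the percentage of the file not occupied by live data and that compaction triggers at or above that threshold.

```python
from datetime import time


def fragmentation(file_size, data_size):
    """Percentage of the file that is not live data (assumed definition)."""
    if file_size == 0:
        return 0.0
    return (file_size - data_size) / file_size * 100


def in_window(now, start, end):
    """True if `now` falls inside the window, which may wrap past midnight."""
    if start <= end:
        return start <= now <= end
    return now >= start or now <= end


def should_compact(file_size, data_size, free_space, now,
                   threshold=70.0, start=time(1, 0), end=time(5, 0)):
    """Hypothetical helper mirroring the rules listed above."""
    if fragmentation(file_size, data_size) < threshold:
        return False  # not fragmented enough yet
    if not in_window(now, start, end):
        return False  # outside the allowed time window
    if free_space < 2 * data_size:
        return False  # not enough disk space; the real daemon logs an error
    return True
```

For example, a 1000-byte file holding 200 bytes of live data is 80% fragmented, so should_compact(1000, 200, 10_000, time(2, 0)) passes all three checks.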



Graph Databases Updates: DEX Graph Database 4.5 and Neo4j 1.7 Milestone 1

Two new releases in the graph databases space:

DEX Graph Database 4.5

The new DEX Graph Database release comes with pre-packaged graph algorithms—breadth and depth first traversal, shortest path, Gabow connectivity—available for Java, .NET, and C++. You can get the new version from here.

Neo4j 1.7 Milestone 1

According to the Neo4j 1.7 Milestone 1 update, this version features:

  • improved Cypher
  • SSL support
  • improved Neo4j documentation
  • high availability improvements (nb: there are recommended maintenance releases for Neo4j 1.5 and 1.6)
  • upgraded Blueprints and Gremlin support

You can get Neo4j 1.7 from here.


Major Riak Release Includes Tons of Improvements, Plus a Riak Admin UI and Riaknostic

One of the major releases that happened around the end of February (and that I missed due to some personal problems) is Riak 1.1. I assume that by now everyone using Riak already knows all the goodies the Basho team packaged in this new release, but for those who are not yet on board, here is a summary:

From the Release notes:

  • Numerous changes to Riak Core which address issues with cluster scalability, and enable Riak to better handle large clusters and large rings
  • New Ownership Claim Algorithm: The new ring ownership claim algorithm introduced as an optional setting in the 1.0 release has been set as the default for 1.1. The new claim algorithm significantly reduces the amount of ownership shuffling for clusters with more than N+2 nodes in them.
  • Riak KV improvements:
    • Listkeys backpressure: Backpressure has been added to listkeys to prevent the node listing keys from being overwhelmed.
    • Don’t drop post-commit errors on floor
  • MapReduce Improvements
    • The MapReduce interface now supports requests with empty queries. This allows the 2i, list-keys, and search inputs to return matching keys to clients without needing to include a reduce_identity query phase.
    • MapReduce error messages have been improved. Most error cases should now return helpful information all the way to the client, while also producing less spam in Riak’s logs.
  • Bitcask and LevelDB improvements
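As a rough illustration of the empty-query MapReduce item above, the following Python sketch builds a request body with a secondary-index (2i) input and no query phases. It assumes the JSON shape used by Riak’s HTTP MapReduce interface ("inputs" plus "query"). The mapreduce_request helper, the bucket, the index name, and the key are made up for the example, and nothing is actually sent to a cluster.

```python
import json


def mapreduce_request(inputs, query=None):
    """Build the JSON body for a MapReduce request; the query is empty by default."""
    return json.dumps({"inputs": inputs, "query": query or []}, sort_keys=True)


# Secondary-index (2i) input; bucket, index, and key values are hypothetical.
body = mapreduce_request(
    {"bucket": "users", "index": "email_bin", "key": "a@example.com"}
)
print(body)
```

With an empty "query" list, the matching keys come back without a reduce_identity phase having to be spelled out.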

Then there’s also Riaknostic and the new Riak admin tool: Riak Control.

What is Riaknostic?

From the initial Riaknostic announcement:

Riaknostic is an Erlang script (escript) that runs a series of “diagnostics” or “checks”, inspecting your operating system and Riak installation for known potential problems and then printing suggestions for how to fix those problems. Riaknostic will NOT fix those problems for you, it’s only a tool for diagnostics. Some of the things it checks are:

  • How much memory does the Riak process currently use?
  • Do Riak’s data directories have the correct permissions?
  • Did the Riak node crash in the past and leave a dump file?

Riaknostic project page is here.

What is Riak Control?

From Riak Control GitHub page:

Riak Control is a set of webmachine resources, all accessible via the /admin/* paths, allow you to inspect your running cluster, and manipulate it in various ways.

That description doesn’t do Riak Control justice. Riak Control is a very fancy REST-driven admin interface for Riak. You don’t have to take my word for it, so check this screenshot:

[Screenshot: Riak Control]

Riak Control covers different details of a Riak cluster:

  • general cluster status
  • details about the cluster
  • details about the ring

This blog post gives more details about Riak Control and a couple more sexy screenshots. If you’d like to dive a bit deeper into Riak Control, you can also watch a 25-minute video of Mark Phillips talking about it after the break.

Since the Riak 1.1.0 release, there has been a bug fix release, 1.1.1, addressing some MapReduce bugs described on the mailing list and in the Riak 1.1.1 release notes.

Riak and Webmachine are the two systems that make me wish I knew Erlang so I could dive in and learn more about them. I’m already (slowly) working to change this.

Apache Cassandra 1.0.8 Maintenance Release

A new maintenance release for Apache Cassandra including over 30 bug fixes. Complete release notes can be found here.


Redis 2.6 Is Near

The old list of features to be included in Redis 2.6 got a lot longer and Salvatore Sanfilippo provides an updated version of the goodies to come:

Well, for one time, a delay is not a signal that something is wrong. What happened is simply that we put a lot more than expected inside this release, so without further delays here is a list of new features

Redis 2.6 will be a major release. Quality or Death!



InfiniteGraph 2.1 Features Gremlin Support and a Plugin Framework

A new version of InfiniteGraph, the graph database from Objectivity, was announced today. This release features:

  • a plugin framework: Two kinds of plugins are supported. A navigator plugin bundles components that assist in navigation queries, such as result qualifiers, path qualifiers, and guides. The Formatter plugin formats and outputs results of graph queries.
  • enhanced IG Visualizer: The advanced Visualizer is now tightly integrated with InfiniteGraph’s Plugin Framework, allowing indexing queries for edges; the Formatter plugin framework exports GraphML and JSON (built-in) or other user-defined plugin formats.
  • support for Tinkerpop Blueprints and Gremlin: InfiniteGraph provides a clean integration with Blueprints that is well suited for applications that want to traverse and query graph databases using Gremlin

A few more details can be found in the InfiniteGraph 2.1 release notes.

Klint Finley


Neo4j 1.6 GA Release: Heroku, Cypher, Lucene 3.5

Announced last week, Jörn Kniv aka Neo4j 1.6 features:

  • Improved Cypher (the query language)
  • Web admin - Full Neo4j Shell commands, including versioned Cypher syntax.
  • Kernel improvements
  • Upgraded Lucene version to 3.5.

The Neo guys have also been pushing their public beta Heroku add-on quite a bit.


More Details About Apache HBase 0.92.0

Jonathan Hsieh provides a summary of the new features in HBase 0.92.0 by splitting them into user features:

  • HFile v2, a new more efficient storage format
  • Faster recovery via distributed log splitting
  • Lower latency region-server operations via new multi-threaded and asynchronous implementations.

operator features:

  • An enhanced web UI that exposes more internal state
  • Improved logging for identifying slow queries
  • Improved corruption detection and repair tools

and developer features:

  • Coprocessors
  • Build support for Hadoop 0.20.20x, 0.22, 0.23.
  • Experimental: offheap slab cache and online table schema change

Earlier today, when covering the HBase 0.92.0 release, I wrote that coprocessors are the highlight of this release. I’ll take that back. There are way too many interesting features in HBase 0.92.0 to highlight just one of them.
