releases: All content tagged as releases in NoSQL databases and polyglot persistence
Tuesday, 22 February 2011
Redis 2.2: An Optimization Release
Salvatore Sanfilippo summarizes the new Redis release in the Hacker News thread:
2.2 was exactly an “optimization” release, to bring what we had at a better level of maturity.
Basically we’ll try hard to don’t add things to the API in the next releases, but just to open to new use cases changing the “backend” part, with cluster support for large fault tolerant deployment, and with diskstore for “bigdata”.
However there are a few important new things in Redis 2.2 from the point of view of the features, I think the main ones are:
- non blocking replication, so that now slaves are able to serve data even when trying to resync with the master.
- Check and Set with WATCH.
- Write operations against keys with an expire set.
- LRU eviction of keys in ‘maxmemory’ mode.
- Support for SETBIT/GETBIT/SETRANGE/GETRANGE, basically this turn the string data type into a random access array.
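To make the "random access array" remark more concrete, here is a minimal Python sketch of how these four commands treat a string value. It is a toy in-memory model, not a Redis client: the RedisString class and its method names are hypothetical, and the semantics (zero-padding on writes past the end, inclusive end index, bit 0 being the most significant bit of the first byte) follow my reading of the Redis command documentation.

```python
class RedisString:
    """Toy in-memory model of a Redis string value (hypothetical, not a client)."""

    def __init__(self, value: bytes = b""):
        self.buf = bytearray(value)

    def setrange(self, offset: int, value: bytes) -> int:
        """Overwrite bytes starting at offset, zero-padding any gap; return the new length."""
        if not value:
            return len(self.buf)  # an empty value changes nothing
        end = offset + len(value)
        if end > len(self.buf):
            self.buf.extend(b"\x00" * (end - len(self.buf)))
        self.buf[offset:end] = value
        return len(self.buf)

    def getrange(self, start: int, end: int) -> bytes:
        """Return bytes start..end inclusive; negative indexes count from the end."""
        n = len(self.buf)
        if start < 0:
            start = max(n + start, 0)
        if end < 0:
            end = max(n + end, 0)
        end = min(end, n - 1)
        if n == 0 or start > end:
            return b""
        return bytes(self.buf[start:end + 1])

    def setbit(self, offset: int, bit: int) -> int:
        """Set one bit (bit 0 is the most significant bit of byte 0); return the old bit."""
        byte_index, shift = offset >> 3, 7 - (offset & 7)
        if byte_index >= len(self.buf):
            self.buf.extend(b"\x00" * (byte_index + 1 - len(self.buf)))
        old = (self.buf[byte_index] >> shift) & 1
        if bit:
            self.buf[byte_index] |= 1 << shift
        else:
            self.buf[byte_index] &= ~(1 << shift) & 0xFF
        return old

    def getbit(self, offset: int) -> int:
        """Read one bit; offsets past the end of the value read as 0."""
        byte_index, shift = offset >> 3, 7 - (offset & 7)
        if byte_index >= len(self.buf):
            return 0
        return (self.buf[byte_index] >> shift) & 1


s = RedisString(b"Hello World")
s.setrange(6, b"Redis")      # value becomes b"Hello Redis"
print(s.getrange(0, 4))      # b'Hello'

flags = RedisString()
flags.setbit(7, 1)           # value becomes b"\x01"
print(flags.getbit(7))       # 1
```

The point of the sketch is that reads and writes address an offset directly instead of fetching and rewriting the whole value on the client side.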
The more verbose release notes are available here.
Redis 2.2 is a drop-in replacement for the previous 2.0 version, though there are some changes in the return values for edge cases.
Update: Minutes ago, Salvatore has announced that there will be a new release today to fix an urgent bug in SPOP.
Update: Redis 2.2.1 is out. You can get it from here.
Original title and link: Redis 2.2: An Optimization Release (NoSQL databases © myNoSQL)
Thursday, 17 February 2011
Cassandra Releases: Two Minor Upgrades
Cassandra has pushed out two new minor releases: the first, 0.7.1, features a couple of performance improvements and new features, and the second, 0.7.2, fixes a critical bug in the 0.7.1 release.
Cassandra 0.7.1 Performance Improvements
- Disk writes and sequential scans avoid polluting page cache (requires JNA to be enabled)
- Cassandra performs writes efficiently across datacenters by sending a single copy of the mutation and having the recipient forward that to other replicas in its datacenter.
- Improved network buffering
- Reduced lock contention on memtable flush
- Optimized supercolumn deserialization
- Zero-copy reads from mmapped sstable files
- Explicitly set higher JVM new generation size
- Reduced i/o contention during saving of caches
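The cross-datacenter write optimization in the list above can be sketched in a few lines of Python. This is only an illustration of the idea: the function name, the (node, datacenter) tuples, and the choice of the first replica in a remote datacenter as the forwarder are all assumptions of mine, and the release notes quoted here do not say how Cassandra actually picks the recipient.

```python
from collections import defaultdict


def plan_cross_dc_write(coordinator_dc, replicas):
    """Plan the messages for one mutation.

    replicas is a list of (node, datacenter) pairs. The result is a list of
    (target_node, forward_to) pairs, where forward_to names the replicas the
    target is asked to relay the mutation to inside its own datacenter.
    """
    by_dc = defaultdict(list)
    for node, dc in replicas:
        by_dc[dc].append(node)

    messages = []
    for dc, nodes in by_dc.items():
        if dc == coordinator_dc:
            # Replicas in the coordinator's datacenter each get a direct copy.
            messages.extend((node, []) for node in nodes)
        else:
            # Only one copy crosses the datacenter link; the recipient
            # (assumed here to be the first replica) forwards it locally.
            messages.append((nodes[0], nodes[1:]))
    return messages


replicas = [("a1", "dc1"), ("a2", "dc1"), ("b1", "dc2"), ("b2", "dc2"), ("b3", "dc2")]
for target, forward_to in plan_cross_dc_write("dc1", replicas):
    print(target, forward_to)
```

With three replicas in the remote datacenter, the plan sends one message over the inter-datacenter link instead of three.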
Cassandra 0.7.1 New Features
- added flush_largest_memtables_at and reduce_cache_sizes_at options to cassandra.yaml as an escape valve for memory pressure
- added option to specify -Dcassandra.join_ring=false on startup to allow “warm spare” nodes or performing JMX maintenance before joining the ring
The complete list of changes can be found here:
Original title and link: Cassandra Releases: Two Minor Upgrades (NoSQL databases © myNoSQL)
Thursday, 10 February 2011
InfiniteGraph 1.1 Released with New Indexing Options
A new version of InfiniteGraph, the graph database from Objectivity, has been released with a new indexing solution offering improved performance for indexing, data imports, and lookups.
InfiniteGraph’s graph processing strengths are well suited to many applications, including those in intelligence, internet systems and services around social media, location based networking and personalization, discovering networks of people that have business, influence or other value, analysis of financial transactions to detect and prevent fraud, and in adding new capabilities to enterprise business intelligence (BI) systems.
For the next releases, it sounds like a lot of work is already scheduled, with InfiniteGraph’s team planning to focus on:
- improving data import
- parallel ingest capabilities leveraging the distributed processing strengths of InfiniteGraph
- integrating with the open source Blueprints project
- faster graph processing
- range querying and geo-hashed indexes
- options to relax InfiniteGraph’s fully ACID compliant consistency model
Things in the graph database space are getting more exciting by the day. Unfortunately, compared to the other NoSQL database categories, the top graph databases are all commercial products, and I think this can be noticed when looking at adoption rates.
Original title and link: InfiniteGraph 1.1 Released with New Indexing Options (NoSQL databases © myNoSQL)
Friday, 28 January 2011
CouchDB 1.0.2: 3rd is Lucky
You’d assume that the more mature a project gets, the less interesting a point release would be. But this doesn’t seem to apply to the NoSQL databases, where with each new release we are seeing exciting new features. From this month alone: Neo4j 1.2, Cassandra 0.7, HBase 0.90.0, and the upcoming MongoDB 1.8.
After two attempts to announce CouchDB 1.0.2 back in December, both stopped at the very last moment by issues that the community considered mandatory to fix before the release, today the CouchDB people are finally announcing the availability of CouchDB 1.0.2.
You can find the list of changes in CouchDB 1.0.2 here:
- Significantly higher read and write throughput against database and view index files.
- Reduce lengthy stack traces.
- Allow reduce=false parameter in map-only views.
- Fix databases forgetting their validation function after compaction.
- Fix occasional timeout errors after successfully compacting large databases.
- Fix occasional error when writing to a database that has just been compacted.
- Fix occasional timeout errors on systems with slow or heavily loaded IO.
- Fix for OOME when compactions include documents with many conflicts.
- Fix for missing attachment compression when MIME types included parameters.
- Preserve purge metadata during compaction to avoid spurious view rebuilds.
- Fix spurious conflicts introduced when uploading an attachment after a doc has been in a conflict. See COUCHDB-902 for details.
- Fix for frequently edited documents in multi-master deployments being duplicated in _changes and _all_docs. See COUCHDB-968 for details on how to repair.
- Fix authenticated replication (with HTTP basic auth) of design documents with attachments.
- Various fixes to make replication more resilient for edge-cases.
- Don’t trigger view updates when requesting _design/doc/_info.
- Documents are now sealed before being passed to map functions.
- Force view compaction failure when duplicated document data exists. When this error is seen in the logs users should rebuild their views from scratch to fix the issue. See COUCHDB-999 for details.
Third attempt is always lucky! Congrats!
Original title and link: CouchDB 1.0.2: 3rd is Lucky (NoSQL databases © myNoSQL)
Wednesday, 19 January 2011
HBase 0.90.0 Released: Over 1000 Fixes and Improvements
As far as I know this is the first major HBase release since it became a top-level Apache project (and it uses a new versioning scheme too). Until now I thought that Hadoop 0.21.0 had the longest list of fixes, improvements, and new features, but I guess HBase 0.90.0 tops that with over 1000 tracked tickets.
I bet there are quite a few exciting things among these over 1000 tickets, but for now I’d suggest taking a look at the slides below from HUG11:
From the slides, a quick summary of what’s new in HBase 0.90.0:
- durability and stability
- HDFS appends + WAL improvements
- master rewrite
- cleanup of master, move region transitions to ZK
- inter-cluster/inter-DB replication
- Bloom filters
- bulk loading improvements
- performance improvements
- peripheral improvements: REST/Stargate, Shell, Avro
- HBaseFSCK
Note: HBase coprocessors are scheduled for 0.92
On the negative side, HBase 0.90.0 runs neither with Hadoop 0.21.0 nor with Hadoop TRUNK; the only compatible Hadoop version is 0.20.x. The release notes for the HBase 0.90 release candidates mention that HBase will lose data unless it runs on a Hadoop HDFS 0.20.x that has a durable sync. There is a Hadoop branch containing the necessary changes, but you’ll have to build that yourself. Update: see Nicolas’ comment below about Hadoop 0.21 being just a development version.
Congrats to the HBase team for their first release as a top-level Apache project!
I didn’t know about The Apache HBase book. But I’m eagerly awaiting my copy of Lars George’s HBase: The Definitive Guide.
Update: the official announcement went out.
Original title and link: HBase 0.90.0 Released: Over 1000 Fixes and Improvements (NoSQL databases © myNoSQL)
Monday, 17 January 2011
Cassandra 0.7: Large Row Support
Something I’ve missed from the what’s new in Cassandra 0.7:
The other big new feature is large row support for up to two billion columns per row. In previous Cassandra releases, there was a limit where a single column value could not be larger than 2 GB.
But the number of columns per row and the size of a single column value are quite different things…
Original title and link: Cassandra 0.7: Large Row Support (NoSQL databases © myNoSQL)
Thursday, 13 January 2011
Neo4j 1.2: What’s New
Neo4j 1.2 was released on December 30th. Now that’s a very weird time to make a major release. But according to the Neo4j roadmap and milestone reports, Neo4j 1.2 brings quite a few major changes and improvements.
The first major shift in Neo4j’s direction is that it is now available as a RESTful server. Even though a Neo4j REST API existed before, this shift from promoting an embedded graph database to a full-blown RESTful graph database was first announced with the first 1.2 milestone. As someone who suggested this change, I cheer the decision.
The second major feature is the high availability Neo4j cluster. Most of the existing graph databases have started their life as embedded storage solutions. Then a few of them have seen the light of becoming server-based storage solutions. But with that also came questions related to availability and scalability.
Starting with this version, Neo4j offers the option of setting up a high availability cluster and this is a major step forward for graph databases. This is still a first version where writes are slower, the cluster is not elastic, and there are limitations at the distributed transaction layer.
Scaling graph databases remains a very complicated problem to solve. Darren Wood’s[1] presentation covers some of the challenges of distributed graph databases.
Neo4j 1.2 features a couple more goodies, like a smaller footprint kernel and an automatic JMX-enabled monitoring and management component.
The original announcement covers more details about this major new Neo4j version. The only missing piece from this release and announcement is a document describing Neo4j API changes. But that should not stop you from trying it out.
- Darren Wood: Architect at InfiniteGraph/Objectivity
Original title and link: Neo4j 1.2: What’s New (NoSQL databases © myNoSQL)
Wednesday, 12 January 2011
Apache Pig 0.8: What is New
Dmitriy Ryaboy[1] has a guest post on the Cloudera blog covering the new features in Apache Pig 0.8.
Summarized:
- Support for user defined functions (UDF) in scripting languages
- Generic UDFs: allows invocation of static java methods
- PigUnit: as the name suggests, a testing tool for Pig scripts
- PigStats: once again the name should give you a hint of what it does: better visibility into Pig job through a series of stats, XML-based metadata injected into Map-Reduce jobs, and listeners for the Pig process
- Scalar values: simplifying access to single-row relations
- possibility to start a monitoring thread for long running executions
- HBaseStorage: works with HBase 0.20 releases only
- custom Map-Reduce jobs can now be included in a Pig flow
- automatic merge of small files
- custom partitioners
The Pig 0.8 release includes a large number of bug fixes and optimizations, but at the core it is a feature release. It’s been in the works for almost a full year and the amount of time spent on 0.8 really shows.
You can also check Dmitriy’s presentations about the NoSQL ecosystem at Twitter: “Twitter, Pig, and HBase” and “HBase and Pig: The Hadoop ecosystem at Twitter”.
- Dmitriy Ryaboy: Twitter engineer, @squarecog
Original title and link: Apache Pig 0.8: What is New (NoSQL databases © myNoSQL)
via: http://www.cloudera.com/blog/2010/12/new-features-in-apache-pig-0-8/
Tuesday, 11 January 2011
Cassandra 0.7 Released, Lots of Goodies in the Box
The much awaited new version of Cassandra was quietly released a couple of days ago. As mentioned in Cassandra 2010 in review, this version brings a lot of interesting new features:
- memory efficient compactions
- online schema changes
- secondary indexes
- improved performance for reads
- upgraded Thrift
The list of updates is too long, so for a start I recommend Gary Dusbabek’s nice post summarizing the most important new features.
Then on Riptano’s blog, there’s a series of articles getting into the details of these features:
- Jonathan Ellis: Live schema updates
- Jonathan Ellis: Secondary indexes
- Brandon Williams: Hadoop output to Cassandra
- Sylvain Lebresne: Expiring columns
I guess the only major feature that was talked about and didn’t get into this release is distributed counters, but that’s already in the Cassandra trunk, so sooner rather than later users will get it.
Now you can head to the download page and start upgrading your Cassandra cluster.
Original title and link: Cassandra 0.7 Released, Lots of Goodies in the Box (NoSQL databases © myNoSQL)
Wednesday, 5 January 2011
Riak 0.14 Released with MapReduce Enhancements, Cluster and Node Debugging
I’ve been waiting for the first NoSQL release to post for the first time in 2011. So thanks to Basho’s announcement of Riak 0.14, myNoSQL is officially back.
Riak 0.14 features those Map/Reduce improvements I’ve already written about[1] and quite a few other interesting features and improvements:
- Cluster and node debugging: The ability to monitor and debug a running Riak cluster received some substantial enhancements in 0.14.
- Windowed merges for Bitcask: Bitcask performs periodic merges over all non-active files to compact the space being occupied by old versions of stored data. In certain situations this can cause some memory and CPU spikes on the Riak node where the merge is taking place. To that end, we’ve added the ability to specify when Bitcask will perform merges.
- Support for HTTPS and multiple HTTP IPs
- REST API for listing buckets
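The windowed merges item above boils down to a simple decision: is the current hour inside the period in which merges are allowed? Here is a minimal Python sketch of that check. The function name and the way the window is represented ("always", "never", or a (start_hour, end_hour) pair with an inclusive end) are assumptions for illustration only, not Bitcask’s actual configuration syntax or code.

```python
def in_merge_window(hour, window):
    """Return True if a merge may start at the given hour (0-23).

    window is "always", "never", or a (start_hour, end_hour) pair. A pair
    whose start is later than its end is treated as wrapping past midnight.
    """
    if window == "always":
        return True
    if window == "never":
        return False
    start, end = window
    if start <= end:
        return start <= hour <= end
    # Window such as (22, 4): late evening through early morning.
    return hour >= start or hour <= end


print(in_merge_window(3, (1, 5)))     # True
print(in_merge_window(12, (1, 5)))    # False
print(in_merge_window(23, (22, 4)))   # True
```

Restricting merges to quiet hours this way keeps the memory and CPU spikes described above away from peak traffic.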
Complete release notes available here.
- As a side note, Kevin Smith, the Basho engineer who first presented these enhancements, has moved to work for Heroku.
Original title and link: Riak 0.14 Released with MapReduce Enhancements, Cluster and Node Debugging (NoSQL databases © myNoSQL)
Wednesday, 15 December 2010
Now official: Spring Data Riak Support Reaches Milestone 1
Shortly after announcing Redis support in Spring Data and just days after Grails got support for Riak, Spring Data is announcing the 1st milestone of Riak support. The same Costin Leau:
The features in 1.0.0 M1 include:
- Generified RiakTemplate for exception translation, serialization, and data access
- Built-in HTTP REST client based on Spring 3.0 RestTemplate
- java.io and Spring IO resource abstractions for reading/writing streams
- java.io.File subclass that represents a Riak resource
Looks like the Spring Framework NoSQL train is moving at full speed now.
Original title and link: Now official: Spring Data Riak Support Reaches Milestone 1 (NoSQL databases © myNoSQL)
Tuesday, 14 December 2010
Cascading 1.2 Released
A bit late with the post, but here is Cascading 1.2:
This release features many performance and usability enhancements while remaining backwards compatible with 1.0 and 1.1. Specifically:
- Performance optimizations during grouping (StreamComparator)
- Composable map-side partial aggregations (AggregateBy)
- Native Riffle support for non-Cascading (or nested iterative Cascading) processes (ProcessFlow and Riffle)
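For readers unfamiliar with map-side partial aggregation, here is a small Python sketch of the general technique. It is not Cascading’s AggregateBy API: the function names, the bounded cache with least-recently-used eviction, and the capacity parameter are my own assumptions, chosen only to show why pre-aggregating on the map side reduces the data handed to the reducers.

```python
from collections import OrderedDict, defaultdict


def map_side_partial_sum(records, capacity=2):
    """Yield (key, partial_sum) pairs while holding at most `capacity` keys."""
    cache = OrderedDict()
    for key, value in records:
        if key in cache:
            cache[key] += value
            cache.move_to_end(key)  # mark the key as recently used
        else:
            if len(cache) >= capacity:
                # Cache is full: emit the least recently used partial sum.
                yield cache.popitem(last=False)
            cache[key] = value
    # Flush whatever is still cached at the end of the input.
    yield from cache.items()


def reduce_side_sum(partials):
    """Combine partial sums per key into final totals."""
    totals = defaultdict(int)
    for key, partial in partials:
        totals[key] += partial
    return dict(totals)


records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5), ("a", 6)]
partials = list(map_side_partial_sum(records, capacity=2))
print(len(records), len(partials))   # 6 5
print(reduce_side_sum(partials))     # {'b': 7, 'a': 10, 'c': 4}
```

Even with a tiny cache, fewer tuples reach the reduce side than were read, and the final totals are the same as a plain group-and-sum.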
Cascading is part of the extensive Hadoop tooling ecosystem.
Original title and link: Cascading 1.2 Released (NoSQL databases © myNoSQL)