BerkleyDB: All content tagged as BerkleyDB in NoSQL databases and polyglot persistence
The main things that helped were: very clean and simple APIs, a set of Java tools we could use for it, very, very easy to set up, good GUI administration tools. All the things you perhaps wouldn’t get if you looked to an open source solution (nb: my emphasis).
You’d perhaps be amazed by what you can find in open source solutions like, say, Riak or this newcomer. No need to bash open source tools in general.
Even though I’ve been using Berkeley DB for over 6 years, I’ve very rarely heard stories about it. This presentation from Yammer tells the story of taking Berkeley DB a long way:
In early 2011 Yammer set out to replace an 11 billion row PostgreSQL message delivery database with something a bit more scale-ready. They reached for several databases with which they were familiar, but none proved to be a fit for various reasons. Following in the footsteps of so few before them, they took the wheel of the SS Berkeley DB Java Edition and piloted it into the uncharted waters of horizontal scalability.
In this talk, Ryan will cover Yammer’s journey through log cleaner infested waters, being hijacked on the high seas by the BDB B-tree cache, and their eventual flotilla of a 45 node, 256 partition BDB cluster.
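To make the cluster numbers in the abstract concrete: 256 partitions do not divide evenly across 45 nodes. Here is a minimal sketch, assuming round-robin placement of partitions on nodes and a CRC32 hash on the key. The abstract does not say how Yammer actually assigned partitions, so both choices and all function names are hypothetical.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 256  # from the talk abstract
NUM_NODES = 45        # from the talk abstract


def partition_for(key: bytes) -> int:
    """Map a key to a fixed partition (assumption: CRC32 modulo partition count)."""
    return zlib.crc32(key) % NUM_PARTITIONS


def node_for(partition: int) -> int:
    """Place a partition on a node (assumption: simple round-robin)."""
    return partition % NUM_NODES


def partitions_per_node() -> Counter:
    """Count how many partitions each node ends up hosting."""
    return Counter(node_for(p) for p in range(NUM_PARTITIONS))


if __name__ == "__main__":
    load = partitions_per_node()
    # Histogram of "partitions hosted" -> "number of nodes hosting that many"
    print(Counter(load.values()))
```

Under these assumptions 31 nodes host 6 partitions and 14 nodes host 5 (31 × 6 + 14 × 5 = 256). Keeping the partition count fixed and larger than the node count means that adding nodes only moves whole partitions around instead of rehashing every key.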
EngineYard’s Ines Sombra recorded a conversation with Mathias Meyer about NoSQL databases and their evolution towards friendlier functionality, relational databases and their steps towards non-relational models, and a bit more on what polyglot persistence means.
Mathias Meyer is one of the people with whom I could talk for days about NoSQL and databases in general, with different infrastructure toppings, and he has some of the most well-balanced thoughts on this exciting space; see this conversation I had with him in the early days of NoSQL. I strongly encourage you to download the MP3 and listen to it.
Original title and link: NoSQL and Relational Databases Podcast With Mathias Meyer ( ©myNoSQL)
Shortly after posting my predictions about the Oracle NoSQL database, I received a link to a PDF introducing it. The main points:
- based on Berkeley DB Java Edition, and thus a key-value store
- a commercial product available as a Community edition and an Enterprise edition
- single-master with multiple replicas
- Paxos-based automated fail-over master election
- supports configurable consistency policies
- update: there’s no download available yet; the date mentioned is mid-October
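The single-master and configurable-consistency points above imply a trade-off between fresh reads and cheap reads. Here is a toy sketch of that trade-off. It is not Oracle's API: the `ToyReplicatedStore` class, the two `ReadPolicy` values, and the explicit `sync()` step are all hypothetical, and master election is left out entirely.

```python
from enum import Enum
from typing import Dict, List, Optional


class ReadPolicy(Enum):
    ABSOLUTE = "absolute"  # read from the master: always the latest write
    RELAXED = "relaxed"    # read from a replica: cheaper, possibly stale


class ToyReplicatedStore:
    """One master takes all writes; replicas catch up only when sync() runs."""

    def __init__(self, num_replicas: int = 2) -> None:
        self.master: Dict[str, str] = {}
        self.replicas: List[Dict[str, str]] = [{} for _ in range(num_replicas)]

    def put(self, key: str, value: str) -> None:
        # Single-master: every write goes to the master first.
        self.master[key] = value

    def sync(self) -> None:
        # Stand-in for asynchronous replication catching up.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.master)

    def get(self, key: str, policy: ReadPolicy, replica: int = 0) -> Optional[str]:
        source = self.master if policy is ReadPolicy.ABSOLUTE else self.replicas[replica]
        return source.get(key)


if __name__ == "__main__":
    store = ToyReplicatedStore()
    store.put("greeting", "hello")
    print(store.get("greeting", ReadPolicy.ABSOLUTE))  # master sees the write
    print(store.get("greeting", ReadPolicy.RELAXED))   # replica has not synced yet
    store.sync()
    print(store.get("greeting", ReadPolicy.RELAXED))   # replica has caught up
```

Choosing the policy per read lets an application pay for up-to-date data only where it matters.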
Update: There’s an official product page: Oracle NoSQL Database Technical Overview.
Oracle NoSQL database key features:
- Simple Data Model
- Key-value pair data structure, keys are composed of Major & Minor keys
- Easy-to-use Java API with simple Put, Delete and Get operations
- Automatic, hash-function based data partitioning and distribution
- Intelligent NoSQL Database driver is topology and latency aware, providing optimal data access
- Predictable behavior
- ACID transactions, configurable globally and per operation
- Bounded latency via B-tree caching and efficient query dispatching
- High Availability
- No single point of failure
- Built-in, configurable replication
- Resilient to single and multi-storage node failure
- Disaster recovery via data center replication
- Easy Administration
- Web console or command line interface
- System and node management
- Shows system topology, status, current load, trailing and average latency, events and alerts
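The data-model and partitioning bullets above can be illustrated with a toy in-memory store. This is only a sketch, not the Oracle NoSQL Database Java API. The `ToyKVStore` class, its method names, the MD5-based hash, and the assumption that only the major key decides the partition are all hypothetical choices made for illustration.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (major key, minor key)


class ToyKVStore:
    """In-memory key-value store with hash-based partitioning (illustrative only)."""

    def __init__(self, num_partitions: int = 8) -> None:
        self.partitions: List[Dict[Key, bytes]] = [{} for _ in range(num_partitions)]

    def _partition(self, major: str) -> int:
        # Assumption: only the major key is hashed, so all records that share
        # a major key land in the same partition.
        digest = hashlib.md5(major.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.partitions)

    def put(self, major: str, minor: str, value: bytes) -> None:
        self.partitions[self._partition(major)][(major, minor)] = value

    def get(self, major: str, minor: str) -> Optional[bytes]:
        return self.partitions[self._partition(major)].get((major, minor))

    def delete(self, major: str, minor: str) -> bool:
        """Remove a record; return True if it existed."""
        removed = self.partitions[self._partition(major)].pop((major, minor), None)
        return removed is not None


if __name__ == "__main__":
    store = ToyKVStore()
    store.put("user:42", "profile", b"alice")
    store.put("user:42", "settings", b"dark-mode")
    print(store.get("user:42", "profile"))
    print(store.delete("user:42", "settings"))
```

Hashing only the major key keeps related records together, which is what would make operations over everything under one major key cheap in a design like this.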
There’s been a lot of speculation about the announcements coming from Oracle’s OpenWorld event. A first part was revealed during the keynote in the form of an in-memory analytics appliance called Exalytics. But there’s also talk about a Big Data Appliance and an Oracle NoSQL database.
Here are my predictions:
Oracle has become very aggressive in selling products based on hardware, software, and services. So they’ll announce a Hadoop appliance integrated with an existing Oracle product. It could be either the Oracle Exadata or even the newly announced Exalytics.
This appliance will place Oracle in competition with all other Hadoop appliance sellers: EMC, NetApp, IBM. Also, these days most analytics databases try to integrate with Hadoop.
Oracle already has a couple of non-relational solutions in their portfolio: Berkeley DB, TimesTen, Coherence. And they’ve already started to test the NoSQL market by announcing the MySQL and MySQL Cluster NoSQL hybrid systems.
I don’t expect the Oracle NoSQL database to be a new product, just a rebranding or repackaging of one of the above-mentioned ones. Probably TimesTen.
Oracle will invest more into integrating its line of products with Hadoop. Having both a Hadoop and an in-memory analytics appliance will make them very competitive in this space.
Oracle will extend the support for NoSQLish interfaces (memcached) to its other database products.
What are your predictions?
Original title and link: The Oracle NoSQL Database and Big Data Appliance ( ©myNoSQL)
The Basho team has ☞ announced the release of Riak 0.11.0, which features a couple of enhancements and bug fixes. More importantly, the new Riak 0.11.0 uses the in-house developed Bitcask storage, thus replacing the embedded InnoDB store and other previously available options.
Bitcask was ☞ announced a while ago as a solution developed to address the following goals:
- low latency per item read or written
- high throughput, especially when writing an incoming stream of random items
- ability to handle datasets much larger than RAM w/o degradation
- crash friendliness, both in terms of fast recovery and not losing data
- ease of backup and restore
- a relatively simple, understandable (and thus supportable) code structure and data format
- predictable behavior under heavy access load or large volume
- a license that allowed for easy default use in Riak
Jeff Darcy has some ☞ very good things to say about Bitcask, so I’ve spent some time reading the ☞ technical paper (pdf). While I’m not an expert in either Bitcask or Berkeley DB Java Edition, I found the top-level goals and some of the implementation details quite similar. I’m pretty sure there are some subtle differences, though, as Berkeley DB is referred to in the paper (nb: maybe it was just about the license).
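For readers who have not gone through the paper, its core idea is an append-only data file plus an in-memory hash index (the paper calls it the keydir) that points at the latest value of each key. Here is a minimal single-file sketch of that idea. It leaves out much of what the real thing does (checksums, timestamps, deletes, multiple data files, merging, handling of partially written records), and the `TinyCask` class and its record layout are hypothetical, not Bitcask's actual format.

```python
import os
import struct
import tempfile
from typing import Dict, Optional, Tuple

HEADER = struct.Struct(">II")  # key length, value length (assumed layout)


class TinyCask:
    """Append-only log with an in-memory index of key -> (value offset, size)."""

    def __init__(self, path: str) -> None:
        self.index: Dict[bytes, Tuple[int, int]] = {}
        self.file = open(path, "a+b")
        self._rebuild_index()

    def _rebuild_index(self) -> None:
        # Recovery: scan the log once; later records win over earlier ones.
        self.file.seek(0)
        while True:
            header = self.file.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            key_len, value_len = HEADER.unpack(header)
            key = self.file.read(key_len)
            value_offset = self.file.tell()
            self.file.seek(value_len, os.SEEK_CUR)  # skip the value itself
            self.index[key] = (value_offset, value_len)

    def put(self, key: bytes, value: bytes) -> None:
        # Writes only ever append, so the disk never sees random-access writes.
        self.file.seek(0, os.SEEK_END)
        self.file.write(HEADER.pack(len(key), len(value)) + key)
        value_offset = self.file.tell()
        self.file.write(value)
        self.file.flush()
        self.index[key] = (value_offset, len(value))

    def get(self, key: bytes) -> Optional[bytes]:
        # One index lookup, then one seek and one read.
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, size = entry
        self.file.seek(offset)
        return self.file.read(size)

    def close(self) -> None:
        self.file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db = TinyCask(os.path.join(tmp, "data.log"))
        db.put(b"riak", b"0.10")
        db.put(b"riak", b"0.11.0")  # the older record stays in the log as garbage
        print(db.get(b"riak"))
        db.close()
```

Even this toy version shows how several of the listed goals fall out of the design: writes are sequential appends, a read costs one seek, and recovery is a scan of the log. The price is that every key must fit in RAM and that stale records pile up until something compacts them.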