


MongoDB: All content tagged as MongoDB in NoSQL databases and polyglot persistence

IBM and 10gen are collaborating on a standard that would make it easier to write applications that can access data from both MongoDB and relational systems such as IBM DB2

The details are pretty confusing [1]:

[…] the new standard — which encompasses the MongoDB API, data representation (BSON), query language and wire protocol — appears to be all about establishing a way for mobile and other next-generation applications to connect with enterprise database systems such as IBM’s popular DB2 database and its WebSphere eXtreme Scale data grid.

But the juicy part is in the comments, if you can ignore the pitches.

  1. If this is a new standard and it is all based on the already existing MongoDB API, BSON, and wire protocol, then 1) what’s new about it and 2) what exactly will make it a standard?

Original title and link: IBM and 10gen are collaborating on a standard that would make it easier to write applications that can access data from both MongoDB and relational systems such as IBM DB2 (NoSQL database©myNoSQL)


What is TokuMX fractal tree-based storage?

A post on Tokutek’s blog explaining TokuMX, the fractal tree-based storage engine for MongoDB:

TokuMX has replaced ALL of the storage code in MongoDB with fractal trees. […]

TokuMX achieves high compression for the same reason TokuDB for MySQL does: fractal trees compress really well by ensuring they compress data in large chunks. TokuMX achieves high insertion rates on index-rich collections for the same reason TokuDB for MySQL performs so well on iiBench, fractal trees are a write-optimized data structure designed to maintain insertion performance on larger than memory workloads. TokuMX does not require constant compaction for the same reason that TokuDB for MySQL does not require users to constantly run “optimize table” to reorganize data, fractal trees don’t fragment. MongoDB and MySQL are very different products with very different user experiences, but the underlying data structure of their storage is the same: the B-Tree. Fractal trees are better.

The post has a lot of links to go through.

✚ Has Tokutek published any papers about the fractal tree engine? I remember reading that the technology was waiting to be patented, but I don’t think I’ve found any papers about it.

Original title and link: What is TokuMX fractal tree-based storage? (NoSQL database©myNoSQL)


MongoDB Indexes - I helped a customer optimize his MongoDB

Recently, I helped a customer optimize his database. Write lock on the database was running consistently at 95%. CPU was spiking consistently, and making for a poor experience.

How long until we’ll see profitable consulting businesses focused on optimizing MongoDB? Wait… we already have them.

Original title and link: MongoDB Indexes - I helped a customer optimize his MongoDB (NoSQL database©myNoSQL)


New Geo Features in MongoDB 2.4

The primary conceptual difference (though there are also many functional differences) between the 2d and 2dsphere indexes, is the type of coordinate system that they consider. Planar coordinate systems are useful for certain applications, and can serve as a simplifying approximation of spherical coordinates. As you consider larger geometries, or consider geometries near the meridians and poles however, the requirement to use proper spherical coordinates becomes important.

I don’t know anything about geo, so I’ll leave this up for experts to comment on.

✚ There’s actually something I like about this announcement: the fact that MongoDB decided to use an existing standard instead of coming up with its own custom solution.
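To put a number on the planar vs. spherical difference from the quote, here is a small sketch (mine, not from the MongoDB post) that compares a naive flat-plane distance, which treats degrees as x/y coordinates, with the great-circle (haversine) distance. The function names, the 6371 km Earth radius, and the sample points are assumptions chosen for illustration; this says nothing about how MongoDB computes distances internally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation
KM_PER_DEGREE = 2 * math.pi * EARTH_RADIUS_KM / 360  # length of one degree along a great circle


def planar_km(lon1, lat1, lon2, lat2):
    """Treat longitude/latitude as flat x/y coordinates (the simplifying approximation)."""
    return math.hypot(lon2 - lon1, lat2 - lat1) * KM_PER_DEGREE


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two points on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # One degree of longitude, measured at the equator and at 80 degrees north.
    for lat in (0.0, 80.0):
        flat = planar_km(0, lat, 1, lat)
        sphere = haversine_km(0, lat, 1, lat)
        print(f"lat={lat}: planar={flat:.1f} km, spherical={sphere:.1f} km")
```

At the equator the two numbers agree; at 80° north the planar figure is more than five times too large, which is the "near the poles" problem the quote is talking about.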

Original title and link: New Geo Features in MongoDB 2.4 (NoSQL database©myNoSQL)


Memory-Mapped I/O in SQLite

Beginning with version 3.7.17, SQLite has the option of accessing disk content directly using memory-mapped I/O and the new xFetch() and xUnfetch() methods on sqlite3_io_methods.

As with the docs about atomic commits, this is one of the best, most succinct, and clearest docs you’ll read about memory-mapped files, their pros and cons, and how SQLite uses them.

If you are a MongoDB user you should read this.
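For the curious: memory-mapped I/O in SQLite is requested through the mmap_size pragma. Below is a minimal sketch using Python's built-in sqlite3 module. The helper name, the table, and the 64 MiB figure are arbitrary choices of mine, and a given build may cap the value or ignore the pragma altogether, so the code treats the reported size as optional.

```python
import os
import sqlite3
import tempfile


def open_with_mmap(path, mmap_bytes=64 * 1024 * 1024):
    """Open a SQLite database and ask for memory-mapped I/O.

    Returns the connection plus the mmap size SQLite reports back,
    or None if this build returns no row for the pragma.
    """
    conn = sqlite3.connect(path)
    row = conn.execute(f"PRAGMA mmap_size={int(mmap_bytes)}").fetchone()
    return conn, (row[0] if row else None)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn, granted = open_with_mmap(os.path.join(tmp, "demo.db"))
        conn.execute("CREATE TABLE posts (title TEXT)")
        conn.execute("INSERT INTO posts VALUES ('Memory-Mapped I/O in SQLite')")
        conn.commit()
        print("SQLite", sqlite3.sqlite_version, "reported mmap_size:", granted)
        print(conn.execute("SELECT title FROM posts").fetchone()[0])
        conn.close()
```

Queries look exactly the same with or without the pragma; the only thing that changes is how SQLite reads pages from the file.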

✚ Check out the HN thread to see how many people love SQLite.

Original title and link: Memory-Mapped I/O in SQLite (NoSQL database©myNoSQL)


NoSQL and Full Text Indexing: Two Trends

On one side:

  1. DataStax with Solr
  2. MapR with LucidWorks Search (nb: Solr)

and on the other side:

  1. Riak Search: Solr-like, but a custom proprietary implementation
  2. MongoDB text search: custom proprietary implementation

I’m not going to argue about the pros and cons of each of these approaches, but I’m sure you already know which of these approaches I’m in favor of.

Original title and link: NoSQL and Full Text Indexing: Two Trends (NoSQL database©myNoSQL)

How to use MongoDB Redis-style: pure in-memory database

Antoine Girbal:

One sweet design choice of MongoDB is that it uses memory-mapped files to handle access to data files on disk. This means that MongoDB does not know the difference between RAM and disk, it just accesses bytes at offsets in giant arrays representing files and the OS takes care of the rest! It is this design decision that allows MongoDB to run in RAM with no modification.

No pun intended, but until MongoDB added journaling (on by default since 2.0), I’ve always looked at MongoDB as an in-memory database. And, I have to confess that even after that, considering all the recommendations and stories I’m reading about MongoDB, I still perceive it as a mostly in-memory database.
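Here is roughly what "bytes at offsets in giant arrays" looks like in practice: a small sketch with Python's standard mmap module. The fixed 16-byte record layout and the function names are made up for illustration and have nothing to do with MongoDB's actual data file format.

```python
import mmap
import os
import tempfile

RECORD_SIZE = 16  # hypothetical fixed-width record, for illustration only


def write_record(mm, index, payload):
    """Store payload in slot `index` by writing straight into the mapped bytes."""
    data = payload.ljust(RECORD_SIZE, b"\x00")[:RECORD_SIZE]
    offset = index * RECORD_SIZE
    mm[offset:offset + RECORD_SIZE] = data


def read_record(mm, index):
    """Read slot `index`; the OS decides whether the bytes come from RAM or disk."""
    offset = index * RECORD_SIZE
    return bytes(mm[offset:offset + RECORD_SIZE]).rstrip(b"\x00")


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, RECORD_SIZE * 4)  # preallocate: an empty file cannot be mapped
        with mmap.mmap(fd, 0) as mm:
            write_record(mm, 2, b"hello")
            mm.flush()  # ask the OS to write dirty pages back to the file
            print(read_record(mm, 2))
    finally:
        os.close(fd)
        os.remove(path)
```

Note that the program never issues an explicit read or write call against the file. Until flush() is called, when the data actually reaches disk is up to the OS, which is why I kept thinking of pre-journaling MongoDB as an in-memory database.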

Original title and link: How to use MongoDB Redis-style: pure in-memory database (NoSQL database©myNoSQL)


The MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.js

MongoDB, ExpressJS, AngularJS, and Node.js: the MEAN stack, or, as the first commenter on the post called it, “the hipster stack”:

A few weeks ago, a friend of mine asked me for help with PostgreSQL. As someone who’s been blissfully SQL-free for a year, I was quite curious to find out why he wasn’t just using MongoDB instead.

It’s all roses on the way to MongoDB.

Original title and link: The MEAN Stack: MongoDB, ExpressJS, AngularJS and Node.js (NoSQL database©myNoSQL)


10 questions to ask when hosting your database on AWS

Dharshan Rangegowda, founder of Scalegrid, posted a list of 10 questions that should be answered before hosting your MongoDB on AWS. But these are generic enough to apply to any database-on-AWS solution. They cover aspects like HA, backup and restore, monitoring, and basic security. If you haven’t done this before, save them as a quick checklist.

✚ Just because you set up HA and backups, it doesn’t mean they’ll actually work when you need them. Test them over and over again. Make it part of your regular procedures.

Original title and link: 10 questions to ask when hosting your database on AWS (NoSQL database©myNoSQL)


MongoLab offers MongoDB on Google Cloud Platform

This was fast:

This week at Google I/O we are launching support for MongoLab’s fifth cloud provider – Google Cloud Platform. You can now use MongoLab to provision and manage MongoDB deployments on Google Compute Engine (GCE)!

Good move for MongoLab and good win for MongoDB users. I’ve read a lot of good things about Google’s Cloud Platform.

Original title and link: MongoLab offers MongoDB on Google Cloud Platform (NoSQL database©myNoSQL)


MetLife uses MongoDB

InformationWeek, in an article about MetLife migrating to MongoDB:

“We had 60 different teams working together as one group, and they were working nights and weekends not because they had to but because they were excited and wanted to,” says Gary Hoberman, MetLife’s senior VP and CIO of regional application development.

Just imagine how many nights and weekends and holidays these guys would put in if allowed to use an IDE. Like vim or emacs.

Original title and link: MetLife uses MongoDB (NoSQL database©myNoSQL)


MongoDB's TTL Collections in OpenStack's Marconi queuing service

Flavio Percoco describes a workaround that OpenStack’s queuing system needs when using MongoDB’s TTL collections:

Even though it is a great feature, it wasn’t enough to cover Marconi’s needs since the later supports per message TTL. In order to cover this, one of the ideas was to implement something similar to Mongodb’s thread and have it running server-side but we didn’t want that for a couple of reasons: it needed a separated thread / process and it had a bigger impact in terms of performance.

This got me thinking it might be one of the (few) features missing from Redis.

✚ Redis supports timeouts for keys. Redis 2.6 brought the accuracy of expiring keys from 1 second to 1 millisecond.

✚ Redis has support for different data structures like lists, sets, and sorted sets. But it’s missing the combination of the two: a timeout applies to a whole key, not to individual elements inside a list, set, or sorted set.
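To make the per-message TTL idea concrete: what Marconi wants is a queue whose individual messages expire on their own schedule. Below is a toy, in-process sketch of that behavior in plain Python, with no Redis or MongoDB involved. The class and method names are hypothetical, and the clock is injectable so the example can be exercised without sleeping.

```python
import heapq
import itertools
import time


class ExpiringQueue:
    """FIFO queue where each message carries its own time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._seq = itertools.count()  # increasing ids, so dict order is arrival order
        self._live = {}                # seq -> message, for messages not yet popped or expired
        self._by_expiry = []           # min-heap of (expires_at, seq)

    def push(self, message, ttl_seconds):
        seq = next(self._seq)
        self._live[seq] = message
        heapq.heappush(self._by_expiry, (self._clock() + ttl_seconds, seq))

    def _purge(self):
        """Drop every message whose expiry time has passed."""
        now = self._clock()
        while self._by_expiry and self._by_expiry[0][0] <= now:
            _, seq = heapq.heappop(self._by_expiry)
            self._live.pop(seq, None)  # may already be gone if it was popped earlier

    def pop(self):
        """Return the oldest unexpired message, or None if there is none."""
        self._purge()
        if not self._live:
            return None
        oldest = next(iter(self._live))
        return self._live.pop(oldest)


if __name__ == "__main__":
    now = [0.0]
    queue = ExpiringQueue(clock=lambda: now[0])
    queue.push("short-lived", ttl_seconds=1)
    queue.push("long-lived", ttl_seconds=60)
    now[0] = 5.0        # pretend five seconds have passed
    print(queue.pop())  # the short-lived message has expired by now
```

Expiry here is lazy (checked on pop) rather than driven by a background thread, which sidesteps the extra thread/process that the Marconi team wanted to avoid.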

Original title and link: MongoDB’s TTL Collections in OpenStack’s Marconi queuing service (NoSQL database©myNoSQL)