document database: All content tagged as document database in NoSQL databases and polyglot persistence

Couchjs: Drop-In Replacement Javascript V8 Engine for Apache CouchDB

By the Iris Couch guys that are also providing Apache CouchDB cloud hosting:

couchjs is a command-line Node.js program. It is 100% compatible with Apache CouchDB’s built-in JavaScript system.

By using couchjs, you will get 100% CouchDB compatibility (the test suite completely passes) but your JavaScript environment is V8, or Node.js.

Original title and link: Couchjs: Drop-In Replacement Javascript V8 Engine for Apache CouchDB (NoSQL database©myNoSQL)

via: https://github.com/iriscouch/couchjs


Zynga Deploys MemSQL for Real-Time Service. Where Does This Leave Couchbase?

Derrick Harris reports for GigaOM about Zynga’s deployment of a MemSQL cluster:

Zynga has deployed nearly 100 nodes of MemSQL, the hot new database from two former Facebook engineers. It might not be a magic pill for Zynga’s woes, but it could help the company boost revenue and even build new types of games. […] At the very least, it could let the company do some things previously out of its reach, such as serve real-time recommendations and ads, and create advanced multi-player games.

Zynga has been the most prominent and most quoted production deployment for Couchbase. That is despite the fact that Zynga has never run stock Couchbase, only a custom in-house version.

The story is clear that the new (100 nodes) MemSQL cluster is augmenting or replacing part of Zynga’s MySQL cluster. But they are using MemSQL to serve real-time recommendations and ads, a scenario that Couchbase touts as one of its strengths.

Original title and link: Zynga Deploys MemSQL for Real-Time Service. Where Does This Leave Couchbase? (NoSQL database©myNoSQL)

via: http://gigaom.com/2013/01/18/can-a-new-database-help-get-zynga-back-on-track/


11 Interesting Releases From the First Weeks of January

The list of releases I wanted to post about has been growing fast these last couple of weeks, so instead of waiting, here it is (in no particular order¹):

  1. (Jan.2nd) Cassandra 1.2 — announcement on DataStax’s blog. I’m currently learning and working on a post looking at what’s new in Cassandra 1.2.
  2. (Jan.10th) Apache Pig 0.10.1 — Hortonworks wrote about it
  3. (Jan.10th) DataStax Community Edition 1.2 and OpsCenter 2.1.3 — DataStax announcement
  4. (Jan.10th) CouchDB 1.0.4, 1.1.2, and 1.2.1 — releases fixing some security vulnerabilities
  5. (Jan.11th) MongoDB 2.3.2 unstable — announcement. This dev release includes support for full text indexing. For more details you can check:

    […] an open source project extending Hadoop and Hive with a collection of useful user-defined-functions. Its aim is to make the Hive Big Data developer more productive, and to enable scalable and robust dataflows.


  1. I’ve tried to order it chronologically, but most probably I’ve failed. 

Original title and link: 11 Interesting Releases From the First Weeks of January (NoSQL database©myNoSQL)


The Cloud Is Broken for MongoDB

The cloud is broken. It’s not designed to properly run persistent data stores like MongoDB. ObjectRocket is designed from the ground up to fix this problem.

Any snarky comments fit perfectly.

James Watters

Original title and link: The Cloud Is Broken for MongoDB (NoSQL database©myNoSQL)

via: http://www.objectrocket.com/details


RavenDB Bulk Inserts: Implementation Details

Ayende Rahien:

We stream the results to the server directly, so while the client is still sending results, we are already flushing them to disk.

To make things even more interesting, we aren’t using standard GZip compression over the whole request. Instead, each batch is compressed independently, which means we don’t have a dependency on the internals of the compression routine internal buffering system, etc. It also means that we get each batch much faster.

There are, of course, rate limits built in, to protect ourselves from flooding the buffers, but for the most part, you will have hard time hitting them.
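The per-batch compression idea can be sketched in a few lines. This is only an illustration in Python, not RavenDB's actual code (RavenDB is a .NET product): the function names, the JSON encoding, and the batch size are all assumptions made for the example. The point it shows is that each batch is a complete gzip payload, so a receiver can decode and persist it without waiting for the rest of the stream.

```python
import gzip
import json


def compress_batches(docs, batch_size):
    """Yield one independently gzip-compressed payload per batch of documents."""
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        # Each batch is a complete gzip payload, so it can be decoded alone.
        yield gzip.compress(json.dumps(batch).encode("utf-8"))


def receive_batches(payloads):
    """Decode payloads one at a time, as a server could while more arrive."""
    received = []
    for payload in payloads:
        batch = json.loads(gzip.decompress(payload).decode("utf-8"))
        received.extend(batch)  # a real server would flush to disk here
    return received


# Hypothetical documents, used only to exercise the sketch.
docs = [{"id": i, "name": f"doc-{i}"} for i in range(10)]
payloads = list(compress_batches(docs, batch_size=4))
restored = receive_batches(payloads)
```

Because every payload stands alone, decoding only the first one already yields the first four documents. That is the property the quote describes: nothing depends on the compressor's internal buffering across the whole request.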

Bulk inserts and data import are two interesting topics in the world of NoSQL databases, where there are no ACID guarantees. What is the state of the database if the data stream is cut midway? What is the state of the database if the import fails midway? What is the state of the database if some insert/update operations fail? I’m not aware of any good answers to these possible issues.

Original title and link: RavenDB Bulk Inserts: Implementation Details (NoSQL database©myNoSQL)

via: http://ayende.com/blog/160547/implementation-details-ravendb-bulk-inserts


How to Monitor MongoDB

A post by Pandora FMS team about monitoring options for MongoDB:

If MongoDB goes wrong, all your apps will fail. So monitoring the main variables and configuration parameters of your database is the best option to make sure that your values are right and your users are happy.

The post talks about the tools that come with MongoDB (mongotop, mongostat, the stats available through the MongoDB shell, logfiles, etc.), but also introduces their PandoraFMS library. There’s no word about 10gen’s hosted MongoDB Monitoring Service, nor other MongoDB utilities for monitoring or the latest MongoMem collection memory usage library.

As for which stats are the most interesting, Simon Maynard’s 5 things to Monitor in MongoDB and these other 3 metrics should be a good start.
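Tools like mongostat report per-second rates, while the server itself exposes cumulative counters. A minimal sketch of that conversion, assuming two samples shaped like the `opcounters` section of MongoDB's `serverStatus` output; the helper name and the sample numbers are made up for illustration, and fetching the samples from a live server is left out.

```python
def opcounter_rates(previous, current, interval_seconds):
    """Return operations per second between two cumulative opcounter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {
        name: (current[name] - previous[name]) / interval_seconds
        for name in current
        if name in previous
    }


# Hypothetical samples shaped like the "opcounters" section of serverStatus.
sample_t0 = {"insert": 1000, "query": 5000, "update": 200, "delete": 10}
sample_t1 = {"insert": 1300, "query": 6500, "update": 260, "delete": 10}
rates = opcounter_rates(sample_t0, sample_t1, interval_seconds=30)
```

With these made-up numbers the sketch gives 10 inserts and 50 queries per second. Alerting on rates rather than raw counters is a design choice: the counters only ever grow, so their absolute values say little about current load.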

Original title and link: How to Monitor MongoDB (NoSQL database©myNoSQL)

via: http://blog.pandorafms.org/?p=821&buffer_share=0406c


A Comparison of 7 Graph Databases

The main page of InfiniteGraph, a graph database commercialized by Objectivity, features an interesting comparison of 7 graph databases (InfiniteGraph, Neo4j, AllegroGraph, Titan, FlockDB, Dex, OrientDB) based on 16 criteria: licensing, source, scalability, graph model, schema model, API, query method, platforms, consistency, concurrency (distributed processing), partitioning, extensibility, visualizing tools, storage back end/persistency, language, backup/restore.

[Image: comparison of 7 graph databases]

Unfortunately the image is almost unreadable, but Peter Karussell has extracted the data in a GoogleDoc spreadsheet embedded below.


MongoDB Text Search Tutorial

Today is the day of the experimental MongoDB text search feature. Tobias Trelle continues his series on the feature with examples of the query syntax (negation and phrase search; according to the previous post, even more advanced queries should be supported), filtering and projections, and indexing of multiple text fields. He also adds details about the stemming solution used (Snowball).

In case you missed the previous posts, here is a quick link list:

  1. MongoDB Full Text Search Explained
  2. Full text search in MongoDB: details about supported languages and queries
  3. Short demo of MongoDB text search and hashed shard keys
  4. Indexing a Markdown blog using MongoDB Full Text Indexing

Original title and link: MongoDB Text Search Tutorial (NoSQL database©myNoSQL)

via: http://blog.codecentric.de/en/2013/01/mongodb-text-search-tutorial/


Indexing a Markdown Blog With MongoDB Full Text Search

A. Jesse Jiryu Davis uses the recently announced MongoDB full text search to index his Markdown-based blog:

The blog had been using a really terrible method for search, involving regular expressions, a full collection scan for every search, and no ranking of results by relevance. I wanted to replace all that cruft with MongoDB’s full-text search ASAP. Here’s what I did.

This looks nice, but I’d like to see how well it works. And there’s one thing that I don’t understand: why parse the HTML when the source text is already in Markdown?

Original title and link: Indexing a Markdown Blog With MongoDB Full Text Search (NoSQL database©myNoSQL)

via: http://emptysquare.net/blog/mongodb-full-text-search/


Short Demo of MongoDB Text Search and Hashed Shard Keys

Staying on the subject of MongoDB full text search (see here and here), a 10-minute demo of the new feature:

Original title and link: Short Demo of MongoDB Text Search and Hashed Shard Keys (NoSQL database©myNoSQL)


Full Text Search in MongoDB: Details About Languages and Queries

Another post about the upcoming MongoDB full text search. This one adds some more details about supported languages and queries:

  • Support for Latin based languages initially, with plans for other character sets later. Initially this will be: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish and Turkish.
  • Support for advanced queries, similar to the Google search syntax e.g. negation and phrase matching.
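To make the Google-like syntax concrete, here is a small sketch that assembles a search string with a phrase and a negated term. It assumes the convention used by MongoDB's text search, where a phrase is wrapped in double quotes and a term is excluded with a leading minus sign. The helper and its parameter names are hypothetical, and sending the string to the server is left out.

```python
def build_search_string(terms=(), phrases=(), excluded=()):
    """Combine plain terms, quoted phrases, and negated terms into one string."""
    parts = list(terms)
    parts += [f'"{phrase}"' for phrase in phrases]   # phrase match: "full text"
    parts += [f"-{term}" for term in excluded]       # negation: -regex
    return " ".join(parts)


query = build_search_string(
    terms=["mongodb"],
    phrases=["full text"],
    excluded=["regex"],
)
```

Here `query` ends up as `mongodb "full text" -regex`: documents should mention mongodb, contain the exact phrase, and not contain regex.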

It’s worth emphasizing that the post refers to character sets when speaking about supported languages, but not to stemming, which differs for many of those.

Original title and link: Full Text Search in MongoDB: Details About Languages and Queries (NoSQL database©myNoSQL)

via: http://blog.serverdensity.com/full-text-search-in-mongodb/


MongoDB Full Text Search Explained

Tobias Trelle explains the features planned for the full text support coming in MongoDB 2.4: stop words, (basic) stemming, full text indexes, and API:

The upcoming release 2.4 of MongoDB will include a first, experimental support for full text search (FTS). This feature was requested early in the history of MongoDB as you can see from this JIRA ticket: SERVER-380. FTS is first available with the developer release 2.3.2.
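For readers new to the terms, stop words, stemming, and a full text index fit together roughly as in the toy sketch below. It is not MongoDB's implementation: the stop word list is a tiny made-up sample, and `naive_stem` is a deliberately crude suffix stripper standing in for a real stemmer.

```python
import re
from collections import defaultdict

# Tiny illustrative stop word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to"}


def naive_stem(word):
    """Strip a few common English suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase, split into words, drop stop words, and stem what is left."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(word) for word in words if word not in STOP_WORDS]


def build_index(docs):
    """Map each stemmed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every (stemmed) query term."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    return set.intersection(*(index.get(token, set()) for token in tokens))


# Hypothetical documents, used only to exercise the sketch.
docs = {
    1: "Indexing the blogs",
    2: "A blog about searching",
    3: "The database index",
}
index = build_index(docs)
```

Thanks to stemming, a search for "blogs" matches documents 1 and 2 even though document 2 only contains "blog". A query made up solely of stop words, such as "the", matches nothing.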

A couple of reasons for MongoDB including full text search:

  1. highly requested feature (239 votes, 193 watchers, 42 participants)
  2. (high level) feature parity with MySQL
  3. NIH

The majority of databases support full text indexing, but almost everyone needing good full text search ends up using Lucene, Solr, Elasticsearch, or Sphinx.

Original title and link: MongoDB Full Text Search Explained (NoSQL database©myNoSQL)

via: http://blog.codecentric.de/en/2013/01/text-search-mongodb-stemming/