document database: All content tagged as document database in NoSQL databases and polyglot persistence
Wednesday, 1 February 2012
MongoDB at GOV.UK: The Power of the Document Model
The alpha version of GOV.UK was using MySQL and PostgreSQL. GOV.UK beta is based on Amazon RDS (MySQL) and MongoDB. In their words:
We started out building everything using MySQL but moved to MongoDB as we realised how much of our content fitted its document-centric approach. Over time we’ve been more and more impressed with it and expect to increase our usage of it in the future.
Here’s what the GOV.UK architecture looks like:

Original title and link: MongoDB at GOV.UK: The Power of the Document Model (©myNoSQL)
Monday, 23 January 2012
Key-Value Stores, Document Databases, and Column Stores as Aggregate Oriented Databases
A different, unified look at the data model of the key-value stores, document databases, and column-family stores from Martin Fowler:
there’s a big similarity between the first three - all have a fundamental unit of storage which is a rich structure of closely related data: for key-value stores it’s the value, for document stores it’s the document, and for column-family stores it’s the column family. In DDD terms, this group of data is an aggregate.
The aggregate approach has been present in the relational database world for quite a while. It came in two flavors: views and denormalization. The first worked well for non-distributed deployments, while the second was used wherever speed mattered or joins were not an option.
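To make the contrast concrete, here is a minimal in-memory sketch of the same order stored as normalized rows and as a single aggregate. The order data, the field names, and both loader functions are hypothetical and only illustrate the idea; they are not taken from Fowler's article.

```python
# Normalized, relational-style rows: one logical order is spread over two "tables".
orders = [{"order_id": 1, "customer": "alice"}]
order_lines = [
    {"order_id": 1, "sku": "book", "qty": 2},
    {"order_id": 1, "sku": "pen", "qty": 5},
]


def load_order_with_join(order_id):
    """Rebuild the aggregate at read time, the way a view or a join would."""
    order = next(o for o in orders if o["order_id"] == order_id)
    lines = [
        {"sku": line["sku"], "qty": line["qty"]}
        for line in order_lines
        if line["order_id"] == order_id
    ]
    return {**order, "lines": lines}


# Aggregate-oriented storage: the whole order is stored and fetched as one unit.
aggregate_store = {
    1: {
        "order_id": 1,
        "customer": "alice",
        "lines": [{"sku": "book", "qty": 2}, {"sku": "pen", "qty": 5}],
    }
}


def load_order_aggregate(order_id):
    """A single key lookup returns the complete aggregate."""
    return aggregate_store[order_id]
```

Both loaders return the same structure. The difference is where the work happens: at read time for the join, at write time for the aggregate.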
Original title and link: Key-Value Stores, Document Databases, and Column Stores as Aggregate Oriented Databases (©myNoSQL)
via: http://java.dzone.com/articles/aggregate-oriented-database
Wednesday, 18 January 2012
Implementing Auto Saves Using RavenDB: NoSQL Tutorials
[…] implementing Auto Save in the RDBMS system could be a problem because of multiple reasons:
- The schema and overall logic changes to save versioned data in the RDBMS system will be non-trivial
- There might be validation checks that fail because users didn’t fill out some fields at that point.
- Making periodic (30 second) transactional updates to any live system is not good for overall performance.
A work around would be saving your Object Model to RavenDB directly and if user visits the document after a time out, load both Transactional Data and Object data, compare the timestamp and use the freshest set of data.
By far the best document database use case I have read about in quite a while.
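The "compare the timestamp and use the freshest set of data" step could look roughly like the sketch below. It is plain Python rather than RavenDB client code, and the `saved_at` field, the `freshest` function, and the sample records are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def freshest(transactional, draft):
    """Return whichever copy was saved most recently.

    Both arguments are dicts carrying a 'saved_at' datetime.
    'draft' may be None when no auto-saved copy exists.
    """
    if draft is None:
        return transactional
    if draft["saved_at"] > transactional["saved_at"]:
        return draft
    return transactional


# Hypothetical data: the committed record and a later auto-saved draft.
committed = {
    "title": "Report",
    "body": "v1",
    "saved_at": datetime(2012, 1, 18, 10, 0, tzinfo=timezone.utc),
}
autosaved = {
    "title": "Report",
    "body": "v2 (unfinished)",
    "saved_at": datetime(2012, 1, 18, 10, 5, tzinfo=timezone.utc),
}
```

Because the draft never goes through the transactional schema, incomplete fields and failed validations are not a problem until the user explicitly saves.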
Original title and link: Implementing Auto Saves Using RavenDB: NoSQL Tutorials (©myNoSQL)
Thursday, 5 January 2012
6 Ways to Handle Relations in RavenDB and Document Databases
Daniel Lang presents 6 solutions for dealing with relations in RavenDB:
If you’re coming from the sql world, chances are you will be confused by the lack of relations in document databases. However, if you’re running RavenDB you’ve got plenty of options to address this trade-off. I personally cannot think of any situation where I’d wish back SQLServer because of this (there could be other reasons).
Two not recommended:
- go to the database twice
- include one document inside the other
Two RavenDB specific solutions:
- implement a read trigger to do server-side joins
- implement a custom responder
Two recommended solutions:
- use the .Include<T>() method
- denormalize your references
A couple of comments:
- the difference between “include one document inside the other” and “denormalize your references” is very subtle: the latter suggests including only the information needed for the presentation layer.
- I think one should consider both “include one document inside the other” and “denormalize your references” and choose between them depending on how likely the embedded documents are to be updated often versus how likely the presentation layer is to change often.
- except for RavenDB, all other document databases seem to offer only two options: “go to the database twice” and “denormalize your references”.
- once Redis releases its version with embedded server-side Lua scripting, that could be used as a form of stored procedure.
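The subtle difference between the two embedding options is easier to see side by side. The following is a database-agnostic sketch using plain dictionaries. The documents, the field names, and both helper functions are hypothetical, and it does not use the RavenDB API.

```python
# Hypothetical documents, as they might be stored in any document database.
author = {
    "id": "authors/1",
    "name": "Ada",
    "email": "ada@example.com",
    "bio": "Mathematician",
}
post = {"id": "posts/1", "title": "Notes on relations"}


def embed_full(post, author):
    """'Include one document inside the other': copy the whole author."""
    return {**post, "author": dict(author)}


def denormalize_reference(post, author, fields=("id", "name")):
    """'Denormalize your references': copy only what the view needs,
    keeping the id so the full document can still be loaded later."""
    return {**post, "author": {f: author[f] for f in fields}}
```

With the denormalized reference, a change to the author's bio does not require touching any post, while a change to the name does. With the full embed, every change to the author has to be propagated.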
Original title and link: 6 Ways to Handle Relations in RavenDB and Document Databases (©myNoSQL)
via: http://daniellang.net/how-to-handle-relations-in-ravendb/
Monday, 19 December 2011
Data Modeling for Document Databases: An Auction and Bids System
Staying with data modeling, but moving to the world of document databases, Ayende has two great posts about modeling an auction system: part 1 and part 2. They are great not only because it’s not the Human-has-Bird-and-Cat-and-Dogs example, but also because he looks at different sets of requirements and offers different solutions.
That is one model for an Auction site, but another one would be a much stronger scenario, where you can’t just accept any Bid. It might be a system where you are charged per bid, so accepting a known invalid bid is not allowed (if you were outbid in the meantime). How would we build such a system? We can still use the previous design, and just defer the actual billing for a later stage, but let us assume that this is a strong constraint on the system.
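One possible way to enforce such a strong constraint is to keep the current highest bid on the auction document itself and reject any write based on a stale copy. This is only a sketch of that idea and not necessarily the design Ayende proposes. The `version` field, the exception classes, and `place_bid` are all hypothetical.

```python
class StaleAuction(Exception):
    """Raised when the auction changed since the bidder last read it."""


class BidTooLow(Exception):
    """Raised when the bid does not beat the current highest bid."""


def place_bid(auction, bidder, amount, expected_version):
    """Accept a bid only if it is valid against the auction's current state.

    Returns a new auction document; the input document is not modified.
    """
    if auction["version"] != expected_version:
        raise StaleAuction("auction was modified; reload and retry")
    if amount <= auction["highest_bid"]["amount"]:
        raise BidTooLow("bid does not exceed the current highest bid")
    return {
        **auction,
        "highest_bid": {"bidder": bidder, "amount": amount},
        "version": auction["version"] + 1,
    }


# Hypothetical starting state: no bids yet.
auction = {
    "id": "auctions/1",
    "highest_bid": {"bidder": None, "amount": 0},
    "version": 1,
}
```

In a real system the version check and the write would have to happen atomically, for example through the optimistic concurrency support of the database. The in-memory version above only shows the rule being enforced.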
Original title and link: Data Modeling for Document Databases: An Auction and Bids System (©myNoSQL)
Document Databases and Data Migrations
Sven Schmidt:
Since CouchDb is inherently unstructured, there’s no global schema that you manage to control your data’s structure. That’s often a good thing, because it gives you flexibility, but it can also cause problems, for example when you want to access documents without handling against all sorts of different “versions” of your document you might have.
When we first talked about document databases we said:
- no more ORM. Do a search or take a quick look at this list of NoSQL libraries and see if that still stands.
- no more schema constraints. The moment the structure of the data showed signs of evolving too rapidly, we started looking for ways to test document structure for inconsistencies.
- no more data migrations. Maybe no data migrations, but data versioning might be needed long term.
I think I’ve already written this once, but here it is again: the sum of constraints in a system is constant. The more relaxed the rules are on a component, the more constraints the rest of the components will need to support.
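Data versioning can be as simple as a version number on each document plus upgrade functions applied when a document is read. The sketch below assumes a hypothetical `schema_version` field and a made-up change (splitting `name` into two fields). It is not CouchDB-specific and is not the approach from the linked post.

```python
CURRENT_VERSION = 2


def _v1_to_v2(doc):
    """Hypothetical change: split 'name' into 'first_name' and 'last_name'."""
    first, _, last = doc["name"].partition(" ")
    upgraded = {k: v for k, v in doc.items() if k != "name"}
    upgraded.update(first_name=first, last_name=last, schema_version=2)
    return upgraded


# Maps a version number to the function that upgrades from it to the next one.
UPGRADES = {1: _v1_to_v2}


def upgrade(doc):
    """Bring a document to CURRENT_VERSION, one step at a time.

    Documents without a 'schema_version' field are treated as version 1.
    """
    version = doc.get("schema_version", 1)
    while version < CURRENT_VERSION:
        doc = UPGRADES[version](doc)
        version = doc["schema_version"]
    return doc
```

The application code then only ever deals with the latest structure, and old documents are rewritten lazily as they are read.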
Original title and link: Document Databases and Data Migrations (©myNoSQL)
via: http://catbrainblog.wordpress.com/2011/12/19/couchdb-migrations/
Monday, 12 December 2011
PostgreSQL Hstore: The Key Value Store Everyone Ignored
A post rehashing PostgreSQL hstore capabilities:
I will be focusing on a key value store that is ACID compliant for real! Postgres takes advantage of its storage engine and has an extension on top for key value storage. So plan is to have a table can have a column that has a datatype of hstore; which in turn has a structure free storage. Thinking of this model multiple analogies throw themselves in. It can be a Column Family Store just like Cassandra where row key can be PK of the table, and each column of hstore type in table can be imagined like a super column, and each key in the hstore entry can be a column name. Similarly you can imagine it some what like Hash structures in Redis (HSET, HDEL), or 2 or 3 level MongoDB store (few modifications required). Despite being similar (when little tricks are applied) to your NoSQL store structures, this gives me an opportunity to demonstrate you some really trivial examples.
A couple of comments:
- you can store key-value pairs in any relational database
- there are quite a few ACID key-value stores available
- hstore is more like a document store. Values are not opaque and it supports queries against them.
- not everyone needs a document database when a key-value store is enough. The most common example is storing web sessions.
- not everyone needs an ACID compliant database. Not in a distributed system requiring high availability.
Anyway, the conclusion remains the same.
Update: there’s a long thread discussing this post on Hacker News .
Original title and link: PostgreSQL Hstore: The Key Value Store Everyone Ignored (©myNoSQL)
via: http://blog.creapptives.com/post/14062057061/the-key-value-store-everyone-ignored-postgresql
Thursday, 8 December 2011
NoSQL Document Databases: Testing Your Document Structure for Inconsistencies
One of the advantages with schema-less design is that it works well for prototyping; you can have a collection of documents with each of the documents of variable structure. You can modify the document structure for one, some or all documents within the collection all without requiring a schema for the collection or each and every document. However, this is also a disadvantage during prototyping; there are no constraints to stop documents within the same collection having variable structure.
Rules of the game are simple: The sum of all constraints in a system is constant. The more relaxed one of the components is, the more validations other components must perform.
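A basic consistency test can be sketched by grouping documents by their set of top-level field names and flagging those that differ from the most common shape. The function names and the sample documents are hypothetical, and the check deliberately ignores nested fields and value types.

```python
from collections import Counter


def structure_report(docs):
    """Count how many documents share each set of top-level field names."""
    return Counter(frozenset(doc) for doc in docs)


def outliers(docs):
    """Return the documents whose field set differs from the most common one."""
    report = structure_report(docs)
    if not report:
        return []
    expected, _count = report.most_common(1)[0]
    return [doc for doc in docs if frozenset(doc) != expected]


# Hypothetical collection: the third document has a misspelled field name.
docs = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Alan", "email": "alan@example.com"},
    {"name": "Grace", "e-mail": "grace@example.com"},
]
```

Running something like this as part of the test suite moves the constraint the database no longer enforces into the application, which is exactly the trade described above.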
Original title and link: NoSQL Document Databases: Testing Your Document Structure for Inconsistencies (©myNoSQL)
The Durable Document Store You Didn't Know You Had, but Did
As it turns out, PostgreSQL has a number of ways of storing loosely structured data/documents in a column on a table.
- hstore is a data type available as a contrib package that allows you to store key/value structures just like a dictionary or hash.
- You can store data in JSON format on a text field, and then use PLV8 to JSON.parse() it right in the database.
- There is a native xml data type, along with a few interesting query functions that allow you to extract and operate on data that sits deep in an XML structure.
I concur. Not knowing your database *must not* be the reason for adopting a NoSQL database.
Original title and link: The Durable Document Store You Didn’t Know You Had, but Did (©myNoSQL)
Tuesday, 6 December 2011
NoSQL Databases Best Practices and Emerging Trends
Jans Aasman (CEO AllegroGraph) interviewed by Srini Penchikala:
InfoQ: What best practices and architecture patterns should the developers and architects consider when using a solution like this one in their software applications?
Jans: If your application requires simple straight joins and your schema hardly changes then any RDBM will do.
If your application is mostly document based, where a document can be looked at as a pre-joined nested tree (think a Facebook page, think a nested JSON object) and where you don’t want to be limited by an RDB schema then key-value stores and document stores like MongoDB are a good alternative.
If you want what is described in the previous paragraph but you have to perform complex joins or apply graph algorithms then the MongoGraph approach might be a viable solution.
Thinking about the products and projects I’ve been working on, most of them have had to deal with all these aspects in different areas of the applications and with different importance to the final solution. Mistakenly though, in most of the cases they ended up using a relational database only. With polyglot persistence here, this shouldn’t happen anymore. That’s not to say though that every project must use all of these technologies just because they are available. But it could use any of them or all combined.
InfoQ: What are the emerging trends in combining the NoSQL data stores?
Jans: From the perspective of a Semantic Web - Graph database vendor what we see is that nearly all graph databases now perform their text indexing with Lucene based indexing (Solr or Elastic Search) and I wouldn’t be surprised that most vendors soon will allow JSON objects as first class objects for graph databases. It was surprisingly straightforward to mix the JSON and triple/graph paradigm. We are also experimenting with key-value stores to see how that mixes with the triple/graph paradigm.
This topic was also discussed during my NoSQL Applications panel, but due to the panel’s time constraints we couldn’t reach a conclusion. It’s definitely an interesting perspective, though.
Original title and link: NoSQL Databases Best Practices and Emerging Trends (©myNoSQL)
11 Document-Oriented Databases Which Are 8: CouchDB, Jackrabbit, MongoDB, RavenDB
Such a list would be even more useful with the following classification:
Production ready
Experimental
Note: A special mention in this category goes to OrientDB and Terrastore which, even if not widely adopted, are still active projects and probably count a couple of production deployments.
Abandonware
Original title and link: 11 Document-Oriented Databases Which Are 8: CouchDB, Jackrabbit, MongoDB, RavenDB (©myNoSQL)
MongoDB, Data Modeling, and Adoption
Micheal Shallop describes in this post how he “built and re-built” a geospatial table, replacing several tables in MySQL with MongoDB:
The mongo geospatial repository will be replacing several tables in the legacy mySQL system – as you may know, mongodb comes with full geospatial support so executing queries against a collection (table) built in this manner is shocking in terms of it’s response speeds — especially when you compare those speeds to the traditional mySQL algorithms for extracting geo-points based on distance ranges for lat/lon coordinates. The tl;dr for this paragraph is: no more hideous trigonometric mySQL queries!
But what actually caught my attention was this paragraph:
What I learned in this exercise was that the key to architecting a mongo collection requires you to re-think how data is stored. Mongo stores data as a collection of documents. The key to successful thinking, at least in terms of mongo storage, is denormalization of your data objects.
This made me realize that MongoDB adoption is benefiting hugely from the fact that its data model and querying are the closest to those of relational databases, and neither requires a radical mind shift from developers who have touched a database at least once. It is like knowing one programming language and learning a second one that follows almost the same paradigms.
The same cannot be said about key-value stores, multi-dimensional maps, MapReduce algorithms, or graph databases. Any of these would require one to set aside pretty much everything learned from the relational model and completely remodel the world. It’s a tougher job, but when used right the effort pays off.
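For context on the trigonometric queries mentioned in the first quote: without geospatial support in the database, distance filtering usually comes down to something like the haversine formula applied to every candidate row. The sketch below shows that calculation in plain Python. The function names and the spherical-Earth radius are assumptions, and this is not the code from the original post.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assuming a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(points, center, radius_km):
    """Filter (lat, lon) tuples to those within radius_km of center.

    This scans every point, which is the cost a geospatial index avoids.
    """
    return [
        p for p in points
        if haversine_km(center[0], center[1], p[0], p[1]) <= radius_km
    ]
```

A geospatial index lets the database answer the same question without evaluating the formula against every document.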
Original title and link: MongoDB, Data Modeling, and Adoption (©myNoSQL)