


PostgreSQL: All content tagged as PostgreSQL in NoSQL databases and polyglot persistence

New PostgreSQL guns for NoSQL market

Joab Jackson (PCWorld):

Embracing the widely used JSON data-exchange format, the new version of the PostgreSQL open-source database takes aim at the growing NoSQL market of nonrelational data stores, notably the popular MongoDB.

I’ve always appreciated the openness of the PostgreSQL developers to consider new features and their efforts to bring these to a relational database. What’s missing from the picture is how many users are actually using these features.

Original title and link: New PostgreSQL guns for NoSQL market (NoSQL database©myNoSQL)


Why the clock is ticking for MongoDB

Robert Haas takes a comparative look at PostgreSQL and the MongoDB features emphasized by MongoDB’s CEO in an interview:

Schireson also mentions another advantage of document stores: schema flexibility. Of course, he again ignores the possible advantages, for some users, of a fixed schema, such as better validity checking. But more importantly, he ignores the fact that relational databases such as PostgreSQL have had similar capabilities since before MongoDB existed. PostgreSQL’s hstore, which provides the ability to store and index collections of key-value pairs in a fashion similar to what MongoDB provides, was first released in December of 2006, the year before MongoDB development began. True JSON capabilities were added to the PostgreSQL core as part of the 9.2 release, which went GA in September of 2012. The 9.4 release, expected later this year, will greatly expand those capabilities. In today’s era of rapid innovation, any database product whose market advantage is based on the format in which it is able to store data will not retain that advantage for very long.

It’s impossible to debate or contradict the majority of facts and arguments the author is making. But in order to understand the history and future of developer tools, it’s worth emphasizing one aspect that has been almost completely ignored for way too long, and that the author mentions just briefly.

Developers want to get things done. Fast and Easy.

For too long, vendors thought that having a feature covered was enough, even if the user had to read a book or two, hire an army of consultants, postpone the deadlines, and finally make three incantations to get it working. This strategy worked well for decades. It worked especially well in the database space, where buying decisions were made at the top level due to the humongous costs.

MySQL became one of the most popular databases because it was free and perceived to be easier than any of the alternatives. Not because it was first. Not because it was feature complete. And definitely not because it was technically superior: PostgreSQL was always technically superior, but never got the install base MySQL got.

MongoDB replays this story by the book. It’s free. It promises features that were missing or are considered complicated in the other products. And it’s perceived as the easiest database to use: a look at MongoDB’s history will immediately reveal its primary focus on ease of use, with great documentation, friendly setup, and a fast getting-started experience. For a lot of people, it really doesn’t matter anymore that there are alternatives that are technically superior. They’ve got their things done. Fast and Easy. Tomorrow is another day.

Original title and link: Why the clock is ticking for MongoDB (NoSQL database©myNoSQL)


Two more things about PostgreSQL

  1. PostgreSQL 9.4 will replace the default JSON column type with Hstore.

    From InfoQ:

    PostgreSQL 9.4 will be reintroducing Hstore as the column type of choice for document-style data. This supersedes PostgreSQL’s JSON support which was introduced in version 9.0. Being a string-based representation, JSON is significantly slower than the binary structure of HStore. And with the addition of Boolean and integer support, the new Hstore is semantically equivalent to JSON. In practical terms this allows two-way conversions between the formats using just a casting operator.

  2. Amazon RDS adds (beta) support for PostgreSQL. Pretty much everything is supported, including PostGIS.

Original title and link: Two more things about PostgreSQL (NoSQL database©myNoSQL)

Heroku Postgres Rollback

New features available on Heroku’s hosted PostgreSQL

Heroku Postgres rollback allows you to “roll back” the state of your database to a previous point in time, just as heroku releases:rollback allows you to roll back to an older deployment of your application. Rollback does not affect your primary database, but instead follows the same pattern as fork: it provisions a new database that is not directly connected to the primary in any way. Like a fork, a rollback will take some time to become available.

I’d really love to know how this is done [1].

  [1] I have some vague ideas, but it’s better to learn than to speculate.

Original title and link: Heroku Postgres Rollback (NoSQL database©myNoSQL)


PostgreSQL and the NoSQL world

After linking earlier today to the MemSQL and JSON story, I thought again about PostgreSQL and its community’s approach to bringing in new features. It’s hard to miss what they are doing. And I think they are doing it right.

The PostgreSQL community looks outside the box and listens. What features are users of NoSQL databases most excited about? Can we offer native support for them? Can we integrate with these other tools? These are the right questions to ask when considering expanding outside your space to help your users.

PostgreSQL 9.2 introduced a JSON data type and JSON functions and operators. Here’s what I wrote when linking to a post about the augmented support for the JSON data type.

PostgreSQL 9.3 added a JavaScript engine (V8), bringing even more power to the JSON data type. Also in PostgreSQL 9.3 there’s support for foreign data wrappers: a feature that allows querying external data sources from PostgreSQL.

While it might sound easy to watch what others are doing and then do it yourself (this is probably well known as the Microsoft strategy), the reality is that there’s a lot of complexity in following this strategy. Besides asking the right questions when picking what features to bring in, there are always the technical and design decisions:

  1. can we actually support this?
  2. can we support it in a way that won’t break or negatively impact existing features?
  3. how should we expose these “imported” features so we make them appealing to existing users (with their vision of the product), while keeping them attractive and familiar to new users?

The last question is the hardest one to answer well.

✚ Here’s also a post I’ve linked to showing how to use PostgreSQL as a schemaless database.

Original title and link: PostgreSQL and the NoSQL world (NoSQL database©myNoSQL)

PostgreSQL as NoSQL with Data Validation

Szymon Guz writes about JSON support in PostgreSQL:

So, I’ve shown you how you can use PostgreSQL as a simple NoSQL database storing JSON blobs of text. The great advantage over the simple NoSQL databases storing blobs is that you can constrain the blobs, so they are always correct and you shouldn’t have any problems with parsing and getting them from the database.

You can also query the database very easily, with huge speed. The ad-hoc queries are really simple, much simpler than the map-reduce queries which are needed in many NoSQL databases.
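To make the idea of constrained JSON blobs concrete, here is a minimal sketch. It is not Szymon Guz’s code: the `products` table, the `name` and `price` keys, and both helper functions are hypothetical, and the `->>` operator assumes PostgreSQL 9.3 or later. The helpers accept any DB-API cursor (for example one from psycopg2), so nothing here opens a connection by itself.

```python
# Hypothetical schema: one JSON document per row in a "products" table.
# The json type already rejects malformed JSON; the CHECK clauses add rules
# about the document's contents. The explicit IS NOT NULL tests matter,
# because a CHECK that evaluates to NULL is treated as satisfied.
CREATE_PRODUCTS = """
CREATE TABLE products (
    id   serial PRIMARY KEY,
    data json NOT NULL,
    CONSTRAINT name_present CHECK (
        data->>'name' IS NOT NULL AND length(data->>'name') > 0
    ),
    CONSTRAINT price_positive CHECK (
        data->>'price' IS NOT NULL AND (data->>'price')::numeric > 0
    )
)
"""

# An ad-hoc query over the stored documents, with no map-reduce step.
# %s is the DB-API placeholder style used by psycopg2.
CHEAP_PRODUCTS = """
SELECT data->>'name' AS name
FROM products
WHERE (data->>'price')::numeric < %s
"""


def create_schema(cursor):
    """Create the constrained table using the given DB-API cursor."""
    cursor.execute(CREATE_PRODUCTS)


def cheap_product_names(cursor, max_price):
    """Return the names of all products priced below max_price."""
    cursor.execute(CHEAP_PRODUCTS, (max_price,))
    return [row[0] for row in cursor.fetchall()]
```

With a table like this, inserting a document that lacks a `name` or has a non-positive `price` would be rejected by the database rather than discovered later by application code.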

Since before NoSQL was called NoSQL, I’ve always thought that there’s a market, and more importantly, there are use cases for using single, unitary platforms for handling data. But there’s also a market, and the corresponding use cases, for using different platforms for handling data. And there are also the federated database systems and the logical data warehouses.

✚ I have this dream about how the databases will look in the future, but I never get around to putting together all the pieces, crossing the t’s and dotting the i’s.

Original title and link: PostgreSQL as NoSQL with Data Validation (NoSQL database©myNoSQL)


PostgreSQL Transaction System

This is a gem.

Original title and link: PostgreSQL Transaction System (NoSQL database©myNoSQL)

PostgreSQL as a Schemaless Database

A very interesting set of slides from Christophe Pettus looking at the features in PostgreSQL that would allow one to use it as a document database:

  1. XML
    1. built-in type
    2. can handle very large documents (2GB)
    3. XPath support
    4. export functions
    5. no built-in indexing, except custom indexes defined with expression indexes
  2. hstore
    1. hierarchical storage type
    2. in contrib (not part of the core)
    3. custom functions (nb: very ugly syntax imo)
    4. GiST and GIN indexes (nb: I’ve posted in the past about PostgreSQL GiST and GIN Index Types)
    5. also supports expression indexes
  3. JSON
    1. built-in type starting with PostgreSQL 9.2
    2. validates JSON
    3. supports expression indexing
    4. nothing else besides a lot of features scheduled for

Christophe Pettus’s slides also include the results and some thoughts about a locally-run pseudo-benchmark against these engines and MongoDB.

You can see all the slides and download them after the break.

Original title and link: PostgreSQL as a Schemaless Database (NoSQL database©myNoSQL)

Extra Security Measures for Database Projects

This means caring about your users’ data:

What we intend to do is shut off updates from the master git repo to the anonymous-git mirror, and to github, from Monday afternoon until Thursday morning. Commit-log emails to pgsql-committers will also be held for this period. This will prevent the commits that fix and document the bug from becoming visible to anyone except Postgres committers. Updates will resume as soon as the release announcement is made.

Original title and link: Extra Security Measures for Database Projects (NoSQL database©myNoSQL)


Cage Match: MySQL vs NoSQL vs Postgres

A post by Brian Aker about the state of MySQL, Postgres and NoSQL databases.

I had a couple of comments and these evolved into a long rant.

MySQL became less interesting once it was acquired […]

I’ve never been very sure what metric is used to measure how interesting a product is. That is, if there’s such a metric. Contrary to some suggestions I’m reading, I haven’t seen stories of people moving away from MySQL because Oracle acquired it, except for Fedora and openSUSE replacing MySQL with MariaDB, and that was due to very specific issues (no security information, no access to regression tests).

the number of Postgres deployments is greater then what all of the NoSQL market combined adds up to

Comparing 15 years of PostgreSQL with 3 years of NoSQL isn’t going to give meaningful results (for a similarly unbalanced comparison, try Oracle vs PostgreSQL). I’m not aware of any database that captured a significant market share in the first 3 years of its existence. Except MySQL. Not Postgres.

Would a document model really matter if schemas could be altered online?

Yes, it would definitely remain relevant. Schema flexibility is not only about updating it, but also about the types allowed. PostgreSQL has indeed added support for arrays and JSON. I see this as a confirmation of what’s happening in the NoSQL space and also about the future of storage engines.

no new language has emerged from the NoSQL market that has any size-able adoption

MongoDB’s query language and the aggregation framework are used by a lot of people. It’s probably not the ideal query language and it comes in two different flavors, but it’s there and it’ll most probably evolve. Admittedly biased, I could also point to RethinkDB’s data manipulation language as an example of something that is probably on par with SQL and without the hidden corner cases of SQL. Indeed, none of these comes close to the adoption SQL has acquired in its 30 years of existence.

Bottom line: I expect bridges to be built between relational databases and NoSQL databases, with each side adopting the features that are useful to its users. I also expect that the “relational databases are crap” vs “NoSQL databases are crap” debate will slowly go away as people realize that the data space is not a zero-sum game. Vendors will be the last to give up this fight, but customers have a lot of power in making this happen.

Original title and link: Cage Match: MySQL vs NoSQL vs Postgres (NoSQL database©myNoSQL)


Handling Growth With Postgres: 5 Tips From Instagram

As we’ve scaled Instagram to an ever-growing number of active users, Postgres has continued to be our solid foundation and the canonical data storage for most of the data created by our users. While less than a year ago, we blogged about how we “stored a lot of data” at Instagram at 90 likes per second, we’re now pushing over 10,000 likes per second at peak—and our fundamental storage technology hasn’t changed.

I only knew about the fifth one, and I think the two tips about partial and functional indexes are extremely useful in general.
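For anyone who hasn’t used these two index kinds, here is a small sketch of both. It is not taken from the Instagram post: the `photos` table and its columns are made up, and SQLite (bundled with Python) stands in for Postgres so the example runs without a server. As far as I know, the two `CREATE INDEX` statements are also valid PostgreSQL syntax as written.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Postgres (an assumption
# made only so the example is runnable anywhere).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE photos (
    id      INTEGER PRIMARY KEY,
    owner   TEXT NOT NULL,
    flagged INTEGER NOT NULL DEFAULT 0
);

-- Partial index: only rows matching the WHERE clause are indexed, so the
-- index stays small when few rows are flagged.
CREATE INDEX photos_flagged_idx ON photos (owner) WHERE flagged = 1;

-- Functional (expression) index: indexes the result of lower(owner), which
-- can serve case-insensitive lookups.
CREATE INDEX photos_owner_lower_idx ON photos (lower(owner));
""")

conn.executemany(
    "INSERT INTO photos (owner, flagged) VALUES (?, ?)",
    [("Alice", 0), ("BOB", 1), ("alice", 0), ("Carol", 1)],
)

# A query whose WHERE clause matches the partial index condition.
flagged_ids = [row[0] for row in conn.execute(
    "SELECT id FROM photos WHERE flagged = 1 ORDER BY id")]

# A query on the same expression that the functional index covers.
alice_ids = [row[0] for row in conn.execute(
    "SELECT id FROM photos WHERE lower(owner) = ? ORDER BY id", ("alice",))]

# Names of the indexes that were created.
index_names = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index'"))
```

Whether the planner actually picks either index depends on the data and the engine, so the sketch only shows how the indexes are declared and which queries they are meant to serve.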

Original title and link: Handling Growth With Postgres: 5 Tips From Instagram (NoSQL database©myNoSQL)


PostgreSQL GiST and GIN Index Types

Triggered by the improvements for PostgreSQL coming with ActiveRecord in Rails 4, today I’ve learned about PostgreSQL GiST and GIN index types:

In choosing which index type to use, GiST or GIN, consider these performance differences:

  • GIN index lookups are about three times faster than GiST
  • GIN indexes take about three times longer to build than GiST
  • GIN indexes are moderately slower to update than GiST indexes, but about 10 times slower if fast-update support was disabled
  • GIN indexes are two-to-three times larger than GiST indexes

As a rule of thumb, GIN indexes are best for static data because lookups are faster. For dynamic data, GiST indexes are faster to update. Specifically, GiST indexes are very good for dynamic data and fast if the number of unique words (lexemes) is under 100,000, while GIN indexes will handle 100,000+ lexemes better but are slower to update.
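The rule of thumb above can be written down as a tiny helper. This is only a sketch of the quoted heuristic, not anything PostgreSQL ships: the function names, the `articles` and `body` identifiers in the usage note, and the `'english'` text search configuration are assumptions for illustration.

```python
# Threshold taken from the quoted rule of thumb (unique lexemes).
LEXEME_THRESHOLD = 100_000


def suggest_text_index(mostly_static: bool, unique_lexemes: int) -> str:
    """Return "gin" or "gist" following the quoted rule of thumb."""
    if mostly_static:
        return "gin"   # faster lookups; slower updates matter little
    if unique_lexemes < LEXEME_THRESHOLD:
        return "gist"  # dynamic data with a modest vocabulary
    return "gin"       # 100,000+ lexemes: GIN handles them better


def create_index_sql(table: str, column: str, method: str) -> str:
    """Build a full-text CREATE INDEX statement for a trusted table/column.

    Identifiers are interpolated directly, so only plain, trusted names
    are accepted.
    """
    if method not in ("gin", "gist"):
        raise ValueError(f"unsupported index method: {method}")
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("table and column must be plain identifiers")
    return (
        f"CREATE INDEX {table}_{column}_{method}_idx ON {table} "
        f"USING {method} (to_tsvector('english', {column}))"
    )
```

For example, `create_index_sql("articles", "body", suggest_text_index(False, 50_000))` yields a GiST index statement for a frequently updated table with a small vocabulary.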

Original title and link: PostgreSQL GiST and GIN Index Types (NoSQL database©myNoSQL)