ALL COVERED TOPICS

NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter

NAVIGATE MAIN CATEGORIES

Close

simpledb: All content tagged as simpledb in NoSQL databases and polyglot persistence

From SimpleDB to Cassandra: Data Migration for a High Volume Web Application at Netflix

Prasanna Padmanabhan and Shashi Madapp posted an article on the Netflix blog describing the process used to migrate data from Amazon SimpleDB to Cassandra:

There will come a time in the life of most systems serving data, when there is a need to migrate data to a more reliable, scalable and high performance data store while maintaining or improving data consistency, latency and efficiency. This document explains the data migration technique we used at Netflix to migrate the user’s queue data between two different distributed NoSQL storage systems.

The steps involved are what you’d expect for a large data set migration:

  1. forklift
  2. incremental replication
  3. consistency checking
  4. shadow writes
  5. shadow writes and shadow reads for validation
  6. end of life of the original data store (SimpleDB)

If you think of it, this is how a distributed, eventually consistent storage works (at least in big lines) when replicating data across the cluster. The main difference is that inside a storage engine you deal with a homogeneous system with a single set of constraints, while data migration has to deal with heterogenous systems most often characterized by different limitations and behavior.

In 2009, Netflix performed a similar massive data migration operation. At that time it involved moving data from its own hosted Oracle and MySQL databases to SimpleDB. The challenges of operating this hybrid solution were described in a the paper Netflix’s Transition to High-Availability Storage Systems authored by Sid Anand.

Sid Anand is now working at LinkedIn where they use Databus for low latency data transfer. But Databus’s approach is very similar.

Original title and link: From SimpleDB to Cassandra: Data Migration for a High Volume Web Application at Netflix (NoSQL database©myNoSQL)

via: http://techblog.netflix.com/2013/02/netflix-queue-data-migration-for-high.html?m=1


NoSQL and Relational Databases Podcast With Mathias Meyer

EngineYard’s Ines Sombra recorded a conversation with Mathias Meyer about NoSQL databases and their evolution towards more friendlier functionality, relational databases and their steps towards non-relational models, and a bit more on what polyglot persistence means.

Mathias Meyer is one of the people I could talk for days about NoSQL and databases in general with different infrastructure toppings and he has some of the most well balanced thoughts when speaking about this exciting space—see this conversation I’ve had with him in the early days of NoSQL. I strongly encourage you to download the mp3 and listen to it.

Original title and link: NoSQL and Relational Databases Podcast With Mathias Meyer (NoSQL database©myNoSQL)


Which NoSQL Databases Are Robust to Net-Splits?

Answered on Quora:

  • Dynamo (key-value)
  • Voldemort (key-value)
  • Tokyo Cabinet (key-value)
  • KAI (key-value)
  • Cassandra (column-oriented/tabular)
  • CouchDB (document-oriented)
  • SimpleDB (document-oriented)
  • Riak (document-oriented)

A couple of clarifications to the list above:

  1. Dynamo has never been available to the public. On the other hand DynamoDB is not exactly Dynamo
  2. Tokyo Cabinet is not a distributed database so it shouldn’t be in this list
  3. CouchDB isn’t a distributed database either, but one could argue that with its peer-to-peer replication it sits right at the border. On the other hand there’s BigCouch.

Original title and link: Which NoSQL Databases Are Robust to Net-Splits? (NoSQL database©myNoSQL)


NoSQL Hosting Services

Michael Hausenblas put together a list of hosted NoSQL solutions including Amazon DynamoDB and SimpleDB, Google App Engine, Riak, Cassandra, CouchDB, MongoDB, Neo4j, and OrientDB. If you go through my posts on NoSQL hosting , you’ll find a couple more.

Original title and link: NoSQL Hosting Services (NoSQL database©myNoSQL)

via: http://webofdata.wordpress.com/2012/03/18/hosted-nosql/


Cross-Platform Global High-Score Using Amazon SimpleDB

Timo Fleisch provides a set of requirements for which using Amazon SimpleDB for storing mobile gaming data sound like a good solution:

In this post I am going to describe my solution for a simple global high score that works with WP7, iOS and Android. […] The preconditions for me for the global high score where:

  • It should be easy and fast to implement and maintain.
  • It should use standard web technologies.
  • It should be scaleable.
  • It should use the standard web http protocol.
  • It should be secure,
  • and it should be as cross platform as possible.

SimpleDB definitely fits the bill for these requirements. But there might be some other details that could lead to using a different approach or making things a tad more complex:

  • the Amazon SimpleDB latency
  • the always-connected to the internet requirement

Original title and link: Cross-Platform Global High-Score Using Amazon SimpleDB (NoSQL database©myNoSQL)

via: http://klutzgames.blogspot.com/2012/03/cross-platform-global-high-score-using.html


Get them by the data

Gavin Clarke and Chris Mellor about AWS Storage Gateway:

Once you’ve got them by the data, of course, their hearts and minds will follow, and Amazon’s using the AWS Storage Gateway beta as a sampler for the rest of its compute cloud.

The Storage Gateway is another piece, together with S3, DynamoDB, SimpleDB, Elastic MapReduce, in Amazon’s great strategical puzzle of a complete polyglot platform.

Original title and link: Get them by the data (NoSQL database©myNoSQL)

via: http://www.theregister.co.uk/2012/01/25/amazon_cloud_enterprise_storage/


Thoughts on SimpleDB, DynamoDB and Cassandra

Adrian Cockcroft:

So the lesson here is that for a first step into NoSQL, we went with a hosted solution so that we didn’t have to build a team of experts to run it, and we didn’t have to decide in advance how much scale we needed. Starting again from scratch today, I would probably go with DynamoDB. It’s a low “friction” and developer friendly solution.

You can look at this in two ways: 1) a biased opinion of someone that has already betted on Amazon with the infrastructure of a multi-billion business; 2) the opinion of someone that has accumulated a ton of experience in the NoSQL space and that is successfully1 running the infrastructure of a multi-billion business on NoSQL solutions. I’d strongly suggest you to think of it as the latter.


  1. Netflix was one of the few companies that continued to operate during Amazon’s EBS major failure. 

Original title and link: Thoughts on SimpleDB, DynamoDB and Cassandra (NoSQL database©myNoSQL)

via: http://perfcap.blogspot.com/2012/01/thoughts-on-simpledb-dynamodb-and.html


Grails 2.0 and NoSQL

Graeme Rocher:

Grails 2.0 is the first release of Grails that truly abstracts the GORM layer so that new implementations of GORM can be used. […] The MongoDB plugin is at final release candidate stage and is based on the excellent Spring Data MongoDB project which is also available in RC form. […] Grails users can look forward to more exciting NoSQL announcements in 2012 with upcoming  future releases of GORM for Neo4j, Amazon SimpleDB and Cassandra in the works.

This is great news.

The very very big news would be a Grails version that doesn’t default anymore to using Hibernate for accessing a relational database.

Original title and link: Grails 2.0 and NoSQL (NoSQL database©myNoSQL)

via: http://blog.springsource.org/2011/12/15/grails-2-0-released/


Storing High Scores in Amazon SimpleDB

This article highlights the benefits of connecting mobile devices to the cloud while also presenting an Amazon SimpleDB use case. Amazon SimpleDB is a highly available, flexible, and scalable non-relational data store that offloads the work of database administration. The app described here demonstrates how to store a high score list or leader board in SimpleDB. The app enables the user to view the high scores sorted by name or score, add and remove scores, and more.

What it doesn’t highlight is what happens with your application if your mobile device is disconnected.

Original title and link: Storing High Scores in Amazon SimpleDB (NoSQL database©myNoSQL)

via: http://www.amazonappstoredev.com/2011/11/h1storing-high-scores-in-amazon-simpledb-using-the-aws-sdk-for-androidh1-p-this-article-highlights-the-benefits-of-co-1.html


Netflix: Run Consistency Checkers All The Time To Fixup Transactions

Todd Hoff about NoSQL and Cloud at Netflix:

You might have consistency problems if you have: multiple datastores in multiple datacenters, without distributed transactions, and with the ability to alternately execute out of each datacenter;  syncing protocols that can fail or sync stale data; distributed clients that cache data and then write old back to the central store; a NoSQL database that doesn’t have transactions between updates of multiple related key-value records; application level integrity checks; client driven optimistic locking.

Original title and link: Netflix: Run Consistency Checkers All The Time To Fixup Transactions (NoSQL databases © myNoSQL)

via: http://highscalability.com/blog/2011/4/6/netflix-run-consistency-checkers-all-the-time-to-fixup-trans.html


Amazon SimpleDB, Google Megastore & CAP

Nati Shalom (Gigaspaces) pulls out a couple of references from James Hamilton’s posts[1] on Amazon SimpleDB and Google Megastore consistency model concluding:

It is interesting to see that the reality is that even Google and Amazon - which I would consider the extreme cases for big data - realized the limitation behind eventual consistency and came up with models that can deal with scaling without forcing a compromise on consistency as I also noted in one of my recent NoCAP series

But he lefts out small details like these:

Update rates within a entity group are seriously limited by:

  • When there is log contention, one wins and the rest fail and must be retried
  • Paxos only accepts a very limited update rate (order 10^2 updates per second)

and

Cross entity group updates are supported by:

  • two-phase commit with the fragility that it brings
  • queueing ans asynchronously applying the changes

Original title and link: Amazon SimpleDB, Google Megastore & CAP (NoSQL databases © myNoSQL)

via: http://natishalom.typepad.com/nati_shaloms_blog/2011/02/a-interesting-note-on-google-megastore-cap.html