


SSD: All content tagged as SSD in NoSQL databases and polyglot persistence

De-Confusing SSD

Great post by Pythian’s Gwen Shapira explaining different aspects of SSD:

  • SSD is fast for reads, but not for writes.
  • It's fast for random writes, but not for sequential writes.
  • You shouldn't use it for redo, except that Oracle does that on their appliances.
  • SSD gets slow over time.
  • SSD has a limited lifespan and is unreliable.
  • Performance depends on exactly which SSD you use.
  • You can have PCI or SATA or even SAN.
  • You can use SSD for flash cache, but only specific versions, maybe.
  • You can have MLC or SLC.
  • It can be enterprise or home grade.

Remember that deciding where to store your data shouldn't be a black-and-white choice between RAM, SSD, and spinning disks.

Original title and link: De-Confusing SSD (NoSQL databases © myNoSQL)


A Preview of Future Disk Drives

A new type of data storage technology, called phase-change memory, has proven capable of writing some types of data faster than conventional flash-based storage. The tests used a hard drive based on prototype phase-change memory chips.

The key phrase throughout the article is “some types of data”. This sounds like a validation of what Jeff Darcy has written about SSD, RAM, and spinning disks.

Original title and link: A Preview of Future Disk Drives (NoSQL databases © myNoSQL)


Solid State Silliness

Jeff Darcy:

[…] progress always comes from those who actually think about how to use the Hot New Thing to complement other approaches instead of expecting one to supplant the other completely. So it is with SSDs, which are a great addition to the data-storage arsenal but cannot reasonably be used as a direct substitute either for RAM at one end of the spectrum or for spinning disks at the other. Instead of putting all data on SSDs, we should be thinking about how to put the right data on them. As it turns out, there are several levels at which this can be done.

This sounds more reasonable to me than the whole “RAM is the new disk” direction some products are evangelizing.
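At the simplest of those levels, putting “the right data” on SSDs just means placing the hottest items on the faster tier. Here is a minimal sketch assuming a frequency-based heat metric and a made-up SSD capacity; none of these names or numbers come from Darcy's post:

```python
from collections import Counter

SSD_CAPACITY_ITEMS = 2  # illustrative capacity, not a real figure

def place_items(access_log, ssd_capacity=SSD_CAPACITY_ITEMS):
    """Put the most frequently accessed keys on SSD, the rest on disk.

    A real tiering policy would also weigh item size, recency, and
    read/write ratio; access frequency alone is the simplest heuristic.
    """
    freq = Counter(access_log)
    ranked = [key for key, _ in freq.most_common()]
    return set(ranked[:ssd_capacity]), set(ranked[ssd_capacity:])

ssd_tier, disk_tier = place_items(["a", "b", "a", "c", "a", "b", "d"])
print("SSD tier:", sorted(ssd_tier))    # the hottest keys
print("disk tier:", sorted(disk_tier))  # everything else
```

The same idea scales from per-object placement up to block-level flash caching, which is one of the other levels Darcy alludes to.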

Original title and link: Solid State Silliness (NoSQL databases © myNoSQL)


RethinkDB: On TRIM, NCQ, and Write Amplification

Closing the circle:

RethinkDB gets around these issues in the following way. We identified over a dozen parameters that affect the performance of any given drive (for example, block size, stride, timing, etc.). We have a benchmarking engine that treats the underlying storage system as a black box and brute-forces through many hundreds of permutations of these parameters to find an ideal workload for the underlying drive.
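A toy version of that black-box search might look like the following. The parameter names, value ranges, and the stand-in scoring function are all assumptions, since RethinkDB hasn't published its actual parameter set:

```python
import itertools
import random

# Hypothetical parameter space; block size, stride, and timing are
# the examples mentioned above, the ranges are invented.
PARAM_SPACE = {
    "block_size": [4096, 8192, 16384, 65536],
    "stride": [1, 2, 4, 8],
    "queue_depth": [1, 4, 16, 32],
}

def benchmark(params):
    """Stand-in for a real I/O benchmark against the drive.

    A real implementation would issue reads/writes with the given
    block size, stride, and queue depth and measure throughput.
    Here we fake a deterministic MB/s score from the parameters.
    """
    random.seed(str(sorted(params.items())))
    return random.uniform(50, 500)

def find_best_workload(space):
    """Brute-force every permutation of parameters, black-box style."""
    best_params, best_score = None, float("-inf")
    for combo in itertools.product(*space.values()):
        params = dict(zip(space.keys(), combo))
        score = benchmark(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

params, score = find_best_workload(PARAM_SPACE)
print(params, round(score, 1))
```

The search space here is only 4 × 4 × 4 = 64 combinations; with over a dozen parameters the “many hundreds of permutations” figure follows naturally.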

Original title and link: RethinkDB: On TRIM, NCQ, and Write Amplification (NoSQL databases © myNoSQL)


RethinkDB and SSD Write Performance

I didn’t know too much about RethinkDB until watching Tim Anglade’s interview with Slava Akhmechet and Mike Glukhovsky. Three things in particular caught my attention:

  1. RethinkDB is first building a persistent, memcached-compatible solution to work with SSDs. The reason for starting with a memcached-compatible system is that building one is much simpler than implementing a MySQL storage engine. On the other hand, I think that having a persistent memcached might bring RethinkDB some early customers to validate the technology.

    Even though it was announced as 8-10 weeks away at the time of the interview, I don’t think this implementation has been launched yet. Update: according to Tim, RethinkDB’s technology has been available to private beta users for a while now. But I still couldn’t find any reference to it on either the website or the blog.

  2. Next will come a MySQL storage engine optimized for SSDs.

  3. Replacing rotational disks with SSDs shows an immediate bump in performance. But shortly after (within months), performance degrades seriously.

It is this last point that I haven’t heard before. And I’d really be interested to understand:

  • whether it applies to all workloads or is specific to databases
  • whether there are particular database scenarios (access patterns, read/write ratios) that lead to this behavior, or whether it manifests in general cases too

My current assumption is that this behavior occurs for write-intensive databases only. But I’d really like to hear some better-documented answers.
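One way to see why write-intensive workloads would be hit hardest is a back-of-the-envelope write-amplification model: once the drive’s spare blocks fill up, each small write forces the controller to copy the still-live pages of an erase block before reclaiming it. The function below is a simplified sketch with made-up numbers, not any vendor’s actual garbage-collection math:

```python
def write_amplification(write_size_bytes, erase_block_bytes, valid_fraction):
    """Rough estimate of physical bytes written per logical byte.

    `valid_fraction` is the share of each reclaimed erase block that
    still holds live data and must be copied elsewhere. The live data
    copied is amortized over the free space the erase reclaims.
    """
    reclaimed = erase_block_bytes * (1 - valid_fraction)  # space freed
    copied = erase_block_bytes * valid_fraction           # live data moved
    extra = (write_size_bytes / reclaimed) * copied
    return (write_size_bytes + extra) / write_size_bytes

# Fresh drive: mostly empty blocks, amplification stays near 1x.
print(write_amplification(4096, 512 * 1024, 0.05))
# Aged, nearly full drive: most of each block is still live data,
# so every 4 KB write drags along many times its size in copies.
print(write_amplification(4096, 512 * 1024, 0.90))
```

Under this model the amplification factor is 1 + valid_fraction / (1 - valid_fraction), which stays near 1 on a fresh drive but reaches 10x once 90% of each erase block holds live data. That would match the “fast at first, degrades after months” observation for write-heavy databases.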

Update: First answer I got to the above questions comes from Travis Truman: The SSD Anthology: Understanding SSDs and New Drives from OCZ.

Update: RethinkDB guys have published a follow up: On TRIM, NCQ, and write amplification.

Original title and link: RethinkDB and SSD Write Performance (NoSQL databases © myNoSQL)