SSD: All content tagged as SSD in NoSQL databases and polyglot persistence
Thursday, 1 November 2012
Amazon EBS, SSD, and Rackspace IOPS Per Dollar
Staying on the subject of IOPS in the cloud, Jeff Darcy did some testing with GlusterFS against Amazon EBS, Amazon SSD, Storm on Demand SSD, and Rackspace instance storage, and computed IOPS/$ for each:
- Amazon EBS: 1000 IOPS (provisioned) for $225/month or 4.4 IOPS/$ (server not included)
- Amazon SSD: 4300 IOPS for $4464/month or 1.0 IOPS/$ (that’s pathetic)
- Storm on Demand SSD: 5500 IOPS for $590/month or 9.3 IOPS/$
- Rackspace instance storage: 3400 IOPS for $692/month (8GB instances) or 4.9 IOPS/$
- Rackspace with 4x block storage per server: 9600 IOPS for $811/month or 11.8 IOPS/$ (hypothetical, assuming CPU or network don’t become bottlenecks)
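The ratios in the list above are simply measured IOPS divided by monthly cost. A minimal sketch that recomputes them from the quoted figures (the dictionary and function names are mine, used only for illustration):

```python
# Measured IOPS and monthly price in dollars, as quoted in the list above.
offerings = {
    "Amazon EBS (provisioned)": (1000, 225),
    "Amazon SSD": (4300, 4464),
    "Storm on Demand SSD": (5500, 590),
    "Rackspace instance storage": (3400, 692),
    "Rackspace 4x block storage": (9600, 811),
}


def iops_per_dollar(iops, monthly_cost):
    """Return IOPS delivered per dollar of monthly cost."""
    return iops / monthly_cost


ratios = {name: iops_per_dollar(iops, cost)
          for name, (iops, cost) in offerings.items()}

# Print the offerings from best to worst value.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.1f} IOPS/$")
```

Keep in mind that the EBS figure excludes the server and the last Rackspace figure is hypothetical, so the ranking is only as good as those caveats.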
Original title and link: Amazon EBS, SSD, and Rackspace IOPS Per Dollar (©myNoSQL)
via: http://pl.atyp.us/wordpress/index.php/2012/10/rackspace-block-storage/
Thursday, 26 July 2012
Voldemort on Solid State Drives
LinkedIn’s experience upgrading their Project Voldemort clusters to SSDs:
At the beginning of this year, we migrated our Voldemort clusters to SSD (Solid State Drives) from SAS (Serial Attached SCSI) disks, to meet increasing demand for IOPS from data intensive applications.
Original title and link: Voldemort on Solid State Drives (©myNoSQL)
via: http://engineering.linkedin.com/voldemort/voldemort-solid-state-drives
Saturday, 21 July 2012
Log-Structured File Systems: There's One in Every SSD
An article from 2009 by Valerie Aurora:
When you say “log-structured file system,” most storage developers will immediately think of Ousterhout and Rosenblum’s classic paper, The Design and Implementation of a Log-structured File System - and the nearly two decades of subsequent work attempting to solve the nasty segment cleaner problem (see below) that came with it. Linux developers might think of JFFS2, NILFS, or LogFS, three of several modern log-structured file systems specialized for use with solid state devices (SSDs). Few people, however, will think of SSD firmware. The flash translation layer in a modern, full-featured SSD resembles a log-structured file system in several important ways. Extrapolating from log-structured file systems research lets us predict how to get the best performance out of an SSD. In particular, full support for the TRIM command, at both the SSD and file system levels, will be key for sustaining long-term peak performance for most SSDs.
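To make the analogy concrete, here is a toy model of a log-structured translation layer with a greedy segment cleaner. It is only a sketch under heavy assumptions: `ToyFTL`, the block and page counts, and the workload are all made up for illustration, and no real SSD firmware is this simple. It compares write amplification (flash writes per host write) when the file system does or does not TRIM the half of the drive it has deleted.

```python
import random


class ToyFTL:
    """Toy log-structured flash translation layer (illustrative only)."""

    def __init__(self, num_blocks=16, pages_per_block=32):
        self.pages_per_block = pages_per_block
        # Each block lists the logical page numbers written to it, in order.
        # None marks a stale page (overwritten or trimmed).
        self.blocks = [[] for _ in range(num_blocks)]
        self.free = list(range(1, num_blocks))  # erased blocks
        self.active = 0                         # block the log appends to
        self.mapping = {}                       # logical page -> (block, index)
        # Two blocks of spare capacity guarantee that the cleaner always
        # finds a victim with at least one stale page.
        self.logical_pages = (num_blocks - 2) * pages_per_block
        self.host_writes = 0
        self.flash_writes = 0

    def write(self, lpn):
        if not 0 <= lpn < self.logical_pages:
            raise ValueError("logical page out of range")
        self.host_writes += 1
        self._invalidate(lpn)  # the old copy goes stale, never updated in place
        self._append(lpn)

    def trim(self, lpn):
        """The file system tells the drive this page no longer holds data."""
        self._invalidate(lpn)

    def _invalidate(self, lpn):
        if lpn in self.mapping:
            block, index = self.mapping.pop(lpn)
            self.blocks[block][index] = None

    def _append(self, lpn):
        if len(self.blocks[self.active]) == self.pages_per_block:
            self.active = self.free.pop()
            if not self.free:  # took the last erased block: reclaim one
                self._clean()
        block = self.blocks[self.active]
        block.append(lpn)
        self.mapping[lpn] = (self.active, len(block) - 1)
        self.flash_writes += 1

    def _live(self, block):
        return [lpn for lpn in self.blocks[block] if lpn is not None]

    def _clean(self):
        """Segment cleaner: copy live pages out of the emptiest block, erase it."""
        candidates = [b for b in range(len(self.blocks))
                      if b != self.active and b not in self.free]
        victim = min(candidates, key=lambda b: len(self._live(b)))
        for lpn in self._live(victim):
            self._append(lpn)  # every copy is an extra flash write
        self.blocks[victim] = []
        self.free.append(victim)


def write_amplification(use_trim, seed=0):
    """Flash writes per host write after the file system deletes half its data."""
    rng = random.Random(seed)
    ftl = ToyFTL()
    for lpn in range(ftl.logical_pages):  # fill the drive once
        ftl.write(lpn)
    half = ftl.logical_pages // 2
    if use_trim:
        for lpn in range(half, ftl.logical_pages):
            ftl.trim(lpn)
    ftl.host_writes = ftl.flash_writes = 0  # measure only what follows
    for _ in range(20000):
        ftl.write(rng.randrange(half))  # random overwrites of the kept half
    return ftl.flash_writes / ftl.host_writes


print("without TRIM:", round(write_amplification(False), 2))
print("with TRIM:   ", round(write_amplification(True), 2))
```

Without TRIM the cleaner has to treat the deleted pages as live data, so the rewritten half competes for much less spare space and more pages get copied per erase. With TRIM those blocks can be erased without copying anything.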
Original title and link: Log-Structured File Systems: There’s One in Every SSD (©myNoSQL)
Friday, 20 July 2012
EC2 Solid State Disks and Cassandra
Jonathan Ellis on using Cassandra with a mix of spinning disks and SSDs:
Finally, I should point out that taking advantage of SSDs in a Cassandra cluster doesn’t have to be all or nothing. You can mix SSD and spinning disks either at the individual node level, or at the cluster level. For the former, Cassandra allows putting “hot” tables on SSD while leaving “cold” ones on spinning disks. But if you want to use a group of nodes for analytical workloads the way DataStax Enterprise does, Cassandra will also be comfortable with having just those nodes be entirely based on cheaper spinning disks, with the remaining, “realtime” nodes based on SSDs. This latter configuration is a good fit for EC2 deployments.
Original title and link: EC2 Solid State Disks and Cassandra (©myNoSQL)
via: http://www.datastax.com/dev/blog/solid-state-disks-now-available-on-amazon-ec2
Cassandra and Solid State Drives
A slide deck by Rick Branson explaining why and how Cassandra takes full advantage of SSDs.
Thursday, 19 July 2012
Amazon Introduces High I/O SSD-backed EC2 Instances
Jeff Barr:
In order to meet this need, we are introducing a new family of EC2 instances[1] that are designed to run low-latency, I/O-intensive applications, and are an exceptionally good host for NoSQL databases such as Cassandra and MongoDB.
Many complaints about running databases on EC2 instances were about I/O. I guess Amazon has been hearing this loud and clear.
[1] Specs of the new EC2 instance:
- 8 virtual cores (35 ECU)
- HVM and PVM virtualization.
- 60.5 GB of RAM.
- 10 Gigabit Ethernet connectivity with support for cluster placement groups.
- 2 TB of local SSD-backed storage, visible as a pair of 1 TB volumes.
Original title and link: Amazon Introduces High I/O SSD-backed EC2 Instances (©myNoSQL)
via: http://aws.typepad.com/aws/2012/07/new-high-io-ec2-instance-type-hi14xlarge.html
Wednesday, 18 July 2012
Benchmarking High Performance I/O With SSD for Cassandra on AWS
Adrian Cockcroft:
The SSD based system running the same workload had plenty of IOPS left over and could also run compaction operations under full load without affecting response times. The overall throughput of the 12-instance SSD based system was CPU limited to about 20% less than the existing system, but with much lower mean and 99th percentile latency. This sizing exercise indicated that we could replace the 48 m2.4xlarge and 36 m2.xlarge with 15 hi1.4xlarge to get the same throughput, but with much lower latency.
Tons of details and data about the benchmarks Netflix ran against the new high I/O SSD-backed EC2 instances. Results are even more impressive than the IOPS numbers in Werner Vogels’s High performance I/O instances for EC2.
Original title and link: Benchmarking High Performance I/O With SSD for Cassandra on AWS (©myNoSQL)
via: http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html
High Performance I/O Instances for Amazon EC2
Werner Vogels:
Databases are one particular area that for scaling can benefit tremendously from high performance I/O. The I/O requirements of database engines, regardless whether they are Relational or Non-Relational (NoSQL) DBMS’s can be very demanding. Increasingly randomized access, and burst IO through aggregation put strains on any IO subsystem, physical or virtual, attached or remote. One area where we have seen this particularly culminate is in modern NoSQL DBMSs that are often the core of scalable modern web applications that exhibit a great deal of random access patterns. They require high replication factors to get to the aggregate random IO they require. Early users of these High I/O instances have been able to reduce their replication factors significantly while achieving rock solid performance and substantially reducing their cost in the process.
That is a jump from around 100 IOPS for 15K RPM spinning disks to over 100,000 IOPS for random reads and 10,000-85,000 IOPS for random writes with SSDs.
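A back-of-the-envelope sketch of why replication factors can drop. The per-device figures are the ones quoted above; the 50,000 IOPS target is a hypothetical workload I picked for illustration, not a number from the post.

```python
import math

DISK_READ_IOPS = 100      # 15K RPM spinning disk, as quoted above
SSD_READ_IOPS = 100_000   # lower bound quoted above for SSD random reads


def devices_needed(target_iops, iops_per_device):
    """Smallest number of devices whose combined IOPS reach the target."""
    return math.ceil(target_iops / iops_per_device)


target = 50_000  # hypothetical aggregate random-read load
print("spinning disks:", devices_needed(target, DISK_READ_IOPS))
print("SSD instances: ", devices_needed(target, SSD_READ_IOPS))
```

This ignores capacity, durability, and write load, all of which also drive replica counts, so treat it as an upper bound on the possible savings rather than a sizing method.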
Original title and link: High Performance I/O Instances for Amazon EC2 (©myNoSQL)
via: http://www.allthingsdistributed.com/2012/07/high-performace-io-instance-amazon-ec2.html
Tuesday, 5 June 2012
SSD vs Spinning Disk Benchmark With Bonnie
Tim Bray published the results of running Bonnie, a filesystem benchmark, against an SSD and a spinning disk both mounted on a MacBook Pro:
Now keep in mind that this is not a benchmark for raw speed, but rather a comparison of the file system and bus and disk access.
Original title and link: SSD vs Spinning Disk Benchmark With Bonnie (©myNoSQL)
via: http://www.tbray.org/ongoing/When/201x/2012/06/04/Maxed-Book
Wednesday, 18 January 2012
Amazon’s DynamoDB Shows Hardware as Means to an End... Actually It's All About Predictability
Derrick Harris:
In that sense, DynamoDB is something of a curveball. It lets AWS users leverage the performance of SSDs, only as the underpinning of a new service rather than as a new IaaS feature alone.
[…]
Web developers use NoSQL databases more frequently than enterprise developers, and NoSQL requires solid-state performance.
I think Derrick got this mostly wrong this time. Developers do not care about SSDs per se. What good developers care about is performance. And great developers care about predictability of performance.
There are a couple of NoSQL databases that know this very well. To give you just a couple of examples, take a look at this benchmark of Riak and see what it is focusing on. Or check Riak’s Bitcask backend (here’s also a great explanation of the Bitcask paper), which guarantees a single disk seek per read. I assume you guessed the keyword behind both of these: predictability.
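The single-seek guarantee comes from keeping an in-memory index that maps each key to the exact position of its latest value in an append-only file. A heavily simplified sketch of the idea follows. `ToyBitcask` is a made-up name, and the real format also stores CRCs, timestamps, and keys in the log, uses multiple files, and merges old ones, none of which is modeled here.

```python
import os
import tempfile


class ToyBitcask:
    """Append-only value log plus an in-memory index (the "keydir")."""

    def __init__(self, path):
        self.log = open(path, "a+b")  # writes always go to the end of the file
        self.keydir = {}              # key -> (offset, size) of the latest value

    def put(self, key, value):
        self.log.seek(0, os.SEEK_END)
        offset = self.log.tell()
        self.log.write(value)         # sequential append, no in-place update
        self.log.flush()
        self.keydir[key] = (offset, len(value))

    def get(self, key):
        offset, size = self.keydir[key]  # memory lookup, no disk access
        self.log.seek(offset)            # the single seek
        return self.log.read(size)       # one contiguous read

    def close(self):
        self.log.close()


def demo():
    with tempfile.TemporaryDirectory() as directory:
        db = ToyBitcask(os.path.join(directory, "data.log"))
        db.put("a", b"one")
        db.put("b", b"two")
        db.put("a", b"three")  # the old value stays in the log as garbage
        values = (db.get("a"), db.get("b"))
        db.close()
        return values


print(demo())
```

Because every read costs one index lookup plus one positioned read regardless of how much data is stored, read latency stays predictable. The price is that all keys must fit in memory.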
Amazon DynamoDB is using SSDs because:
- it wants to offer predictable low latency
- it wants to offer predictable throughput
- it wants to offer single-digit millisecond average service-side responses
- and it wants to do all these at any scale of dataset sizes and request rates
Hardware is a means to an end. And SSD or not, the points above are all that matter[1].
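A small illustration of why average performance and predictable performance are different things. The two latency samples below are invented for the example: they share the same mean, but their 99th percentiles are far apart.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds.
steady = [5.0] * 100                # every request takes 5 ms
spiky = [2.0] * 97 + [102.0] * 3    # mostly faster, but 3% are very slow

for name, samples in (("steady", steady), ("spiky", spiky)):
    print(name,
          "mean:", statistics.mean(samples),
          "p99:", percentile(samples, 99))
```

Judged by the mean alone the two systems look identical. A service that promises predictable latency has to be judged on the tail.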
[1] There are other dimensions of systems that are as critical as the ones covered (e.g. availability, fault-tolerance, etc.), but these are less related to the SSD vs spinning-disks discussion.
Original title and link: Amazon’s DynamoDB Shows Hardware as Means to an End… Actually It’s All About Predictability (©myNoSQL)
via: http://gigaom.com/cloud/amazons-dynamodb-shows-hardware-as-mean-to-an-end/
Wednesday, 21 December 2011
CouchDB's File Format Is Brilliantly Simple and Speed-Efficient at the Cost of Disk Space
Riyad Kalla:
I have been reading up on log structured file systems, efficient data formats, database storage engines and copy-on-write semantics for a little more than a week now… reading about the pros and cons of different approaches and seeing it all come together so smoothly in a single design like Couch’s really deserves a hat-tip to the Couch team.
Great post looking at the pros of CouchDB’s storage format and the tradeoffs the team made along the way.
Original title and link: CouchDB’s File Format Is Brilliantly Simple and Speed-Efficient at the Cost of Disk Space (©myNoSQL)
via: https://plus.google.com/u/0/107397941677313236670/posts/CyvwRcvh4vv
Friday, 25 November 2011
De-Confusing SSD
Great post by Pythian’s Gwen Shapira explaining different aspects of SSD:
SSD is fast for reads, but not for writes. It’s fast for random writes, but not for sequential writes. You shouldn’t use it for redo, except that Oracle do that on their appliances. SSD gets slow over time. SSD has a limited lifespan and is unreliable. Performance depends on exactly which SSD you use. You can have PCI or SATA or even SAN. You can use SSD for flash cache, but only specific versions, maybe. You can have MLC or SLC. It can be enterprise or home grade.
Remember that choosing where to store your data shouldn’t be a black-and-white decision between RAM, SSD, and spinning disks.
Original title and link: De-Confusing SSD (©myNoSQL)
via: http://www.pythian.com/news/28797/de-confusing-ssd-for-oracle-databases/

