


storage: All content tagged as storage in NoSQL databases and polyglot persistence

Backblaze: What hard drive should I buy?

Continuing their amazing series of posts on hard drives (the previous one, How long do disk drives last?, is a must-read), Backblaze's latest post shares stats on the hard drive models the company uses and their failure rates:

Because Backblaze has a history of openness, many readers expected more details in my previous posts. They asked what drive models work best and which last the longest. Given our experience with over 25,000 drives, they asked which ones are good enough that we would buy them again. In this post, I’ll answer those questions.

The answer you are looking for is: Hitachi. There are two things to keep in mind though:

  1. these drives are used inside Backblaze’s dedicated storage pods, which are the basis of the backup service Backblaze offers.
  2. Hitachi sold its hard drive business to Western Digital and Toshiba.


✚ As a side note, one of Backblaze’s favorites is the Western Digital 3TB Red. I recently got a Synology and put 2 x 4TB WD Reds inside. Not having Backblaze’s data at the time, I based my decision on a combination of reviews and costs.

Original title and link: Backblaze: What hard drive should I buy? (NoSQL database©myNoSQL)


Storage Pod - 180TB and Probably Growing

We thought ten people would care; instead a million people read our Storage Pod 1.0 blog post where we open sourced the Backblaze Storage Pod design and introduced the world’s most cost-efficient way to store big data. […] Today we introduce Backblaze Storage Pod 3.0 which stores more data, costs less, is more reliable, and is easier to service.

Because my knowledge of building hardware has been stable around zero forever, I quite enjoy at least reading about it.

the HN thread



Petabyte Reliable DNA Storage

The abstract of the report “Towards practical, high-capacity, low-maintenance information storage in synthesized DNA”:

This challenge has focused some interest on DNA as an attractive target for information storage because of its capacity for high-density information encoding, longevity under easily achieved conditions and proven track record as an information bearer. […] We encoded computer files totalling 739 kilobytes of hard-disk storage and with an estimated Shannon information of 5.2 × 10^6 bits into a DNA code, synthesized this DNA, sequenced it and reconstructed the original files with 100% accuracy.

The article is behind a paywall, but Gizmodo summarizes the published results:

[…] they can store 2.2 petabytes of information in a single gram of DNA, and recover it with 100 percent accuracy.



The Post-RAID Era Begins

Robin Harris looks into why RAID is no longer a growing technology, if it is still a viable one at all, and how solutions based on fountain (rateless) erasure codes could replace it, delivering better data protection than the 3x replication we usually see in the NoSQL space.

The post-RAID (noRAID) era has begun. While RAID arrays aren’t going away, the growth is elsewhere, and corporate investment follows growth.
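To make the comparison with 3x replication concrete, here is a minimal sketch of the storage overhead and loss tolerance of both approaches. It is not taken from Robin Harris's article: the 10 + 4 fragment layout is a hypothetical example, and an idealized fixed-rate code (any k of the k + m fragments rebuild the data) stands in for fountain codes, which can generate as many fragments as needed and typically need slightly more than k of them to decode.

```python
def replication(copies):
    """Overhead and tolerated losses when storing `copies` full copies."""
    return {"overhead": float(copies), "tolerated_losses": copies - 1}


def erasure_code(data_fragments, parity_fragments):
    """Same figures for an idealized erasure code.

    Assumption: any `data_fragments` out of the total fragments are
    enough to rebuild the original data.
    """
    total = data_fragments + parity_fragments
    return {
        "overhead": total / data_fragments,
        "tolerated_losses": parity_fragments,
    }


three_way = replication(3)       # the usual NoSQL setup
ec_10_4 = erasure_code(10, 4)    # hypothetical 10 data + 4 parity layout

for name, scheme in (("3x replication", three_way), ("10+4 erasure code", ec_10_4)):
    print(f"{name}: {scheme['overhead']:.1f}x raw storage, "
          f"survives {scheme['tolerated_losses']} lost fragments/copies")
```

Under these assumptions the erasure-coded layout survives more simultaneous losses while using less than half the raw storage. The price is more network and CPU work on reads and repairs, which this sketch ignores.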



Possible 100-fold increase in data storage speed

European researchers may have found a way to speed up data storage 100-fold, breaking one barrier holding back how fast data can be transferred. […] The researchers at York University in the U.K. and Nijmegen University in the Netherlands accomplished the feat by heating a magnetic material with laser bursts that alter what is called the magnetic spin of the material at the atomic level, according to an explanation by York University. There are two possible spins, parallel and anti-parallel, and in storage, these binary states would represent the ones and zeros that designate bit types.

I still find the salmon storage more mouthwatering.



IDC: Storage Shipments Keep Surging. Where Is the Exponential Growth Though?


Companies are updating their storage systems for the era of “big data,” to deal with huge and growing volumes of information, she said. The total market for disk storage systems grew just over 10 percent from last year’s second quarter to reach almost US$7.5 billion in revenue

Maybe I’m misreading the data, but if we are talking about exponential growth of data (Big Data), where is the exponential growth in storage shipments?
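One way to put the 10 percent figure in perspective is to turn it into a doubling time. The sketch below is only a rough illustration. It treats the quoted revenue growth as a steady annual rate, which is an assumption, and revenue is not the same as capacity shipped, since the price per terabyte keeps falling. The "doubling every two years" comparison rate is hypothetical and does not come from the IDC report.

```python
import math


def doubling_time_years(annual_growth_rate):
    """Years needed to double at a steady compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


def annual_rate_for_doubling(years):
    """Compound annual growth rate implied by doubling every `years` years."""
    return 2 ** (1 / years) - 1


# 10% per year, the revenue growth quoted above
revenue_doubling = doubling_time_years(0.10)

# Hypothetical comparison: data volume doubling every two years
rate_two_year = annual_rate_for_doubling(2)

print(f"10% yearly growth doubles in about {revenue_doubling:.1f} years")
print(f"Doubling every 2 years needs about {rate_two_year:.0%} yearly growth")
```

At 10 percent a year, revenue takes a bit over seven years to double, while data that doubles every two years would need to grow by roughly 41 percent a year.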



Petabytes on a Budget V2.0: Backblaze Storage

It’s been over a year since Backblaze revealed the designs of our first generation (67 terabyte) storage pod. During that time, we’ve remained focused on our mission to provide an unlimited online backup service for $5 per month. To maintain profitability, we continue to avoid overpriced commercial solutions, and we now build the Backblaze Storage Pod 2.0: a 135-terabyte, 4U server for $7,384. It’s double the storage and twice the performance—at lower cost than the original.

So Facebook would need around 220 Backblaze servers to store their 30PB of data (30,000TB at 135TB per pod). That doesn’t sound like much.
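A quick back-of-the-envelope check of the pod count, using the capacity and price from the Backblaze quote. The sketch assumes decimal units (1PB = 1,000TB) and counts raw capacity only, with no redundancy, spares, or filesystem overhead, so a real deployment would need more.

```python
import math

POD_CAPACITY_TB = 135   # Storage Pod 2.0 capacity, from the quote
POD_COST_USD = 7_384    # Storage Pod 2.0 price, from the quote
DATASET_PB = 30         # Facebook figure used in this post
TB_PER_PB = 1_000       # assumption: decimal units


def pods_needed(dataset_pb, pod_tb=POD_CAPACITY_TB):
    """Number of whole pods needed to hold `dataset_pb` of raw data."""
    return math.ceil(dataset_pb * TB_PER_PB / pod_tb)


pods = pods_needed(DATASET_PB)
hardware_cost = pods * POD_COST_USD

print(f"{pods} pods, about ${hardware_cost:,} in hardware")
```

That comes to 223 pods and about $1.65 million in hardware, before any replication.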



Storage Directions in an Era of Big Data

David Floyer (ex-IDC analyst) covers, in a long article, the major forces in the storage industry and the trends that will define IT development for the coming decade:

The storage infrastructure will allow dynamic transport of data across the network when required, for instance to support business continuity, and with some balancing of workloads. However, data volumes and bandwidth are growing at approximately the same rate, and large-scale movement of data between sites will not be a viable strategy. Instead applications (especially business intelligence and analytics applications) will often be moved to where the data is (the Hadoop model) rather than data being moved to the applications. This will be especially true of “big data” environments, where vast amounts of semi-structure data will be available within the private and public clouds.

David looks at the impact of multi-core processors, flash, virtualization, and disk drive technologies on storage, and considers three mega-trends in the data wave (Big Data):

  1. The simplification of IT infrastructures through convergence,
  2. Massive cost reductions through virtualization,
  3. New business models enabled by cloud computing.
