Choosing Technologies: The Library of Congress and the Twitter Archive

Remember when everyone was suggesting solutions for Twitter’s architecture? Now the Library of Congress is trying to figure out which technologies to use to store the Twitter archive:

The project is still very much under construction, and the team is weighing a number of different open source technologies in order to build out the storage, management and querying of the Twitter archive. While the decision hasn’t been made yet on which tools to use, the library is testing the following in various combinations: Hive, ElasticSearch, Pig, Elephant-bird, HBase, and Hadoop.

Note that in terms of storage, only HBase is mentioned, even though Twitter’s main tweet storage is MySQL.

Original title and link: Choosing Technologies: The Library of Congress and the Twitter Archive (NoSQL database©myNoSQL)

via: http://blogs.forbes.com/oreillymedia/2011/06/13/the-library-of-congress-twitter-archive-one-year-later/