Choosing Technologies: The Library of Congress and the Twitter Archive
Remember when everyone was suggesting solutions for Twitter architecture? Now the Library of Congress is trying to figure out what technologies to use to store the Twitter archive:
The project is still very much under construction, and the team is weighing a number of different open source technologies in order to build out the storage, management and querying of the Twitter archive. While the decision hasn’t been made yet on which tools to use, the library is testing the following in various combinations: Hive, ElasticSearch, Pig, Elephant-bird, HBase, and Hadoop.
Note that in terms of storage only HBase is mentioned—Twitter’s main tweet storage is MySQL though.
Original title and link: Choosing Technologies: The Library of Congress and the Twitter Archive (NoSQL database©myNoSQL)