Jeopardy Goes to Hadoop

Did you know that Hadoop was the knowledge base behind the Watson supercomputer? I didn’t:

Hadoop was used to create Watson’s “brain,” or the database of knowledge and facilitation of Watson’s processing of enormously large volumes of data in milliseconds. Watson depends on 200 million pages of content and 500 gigabytes of preprocessed information to answer Jeopardy questions. That huge catalog of documents has to be searchable in seconds.

I’d love to read which other open source tools were used to build Watson. For example, did Watson use the Python-based Natural Language Toolkit?

Update: In a comment, Jeroen Latour points to a presentation about Watson’s DeepQA Project and an article available in PDF format.

Original title and link: Jeopardy Goes to Hadoop (NoSQL databases © myNoSQL)

via: http://ycorpblog.com/2011/02/18/jeopardy-hadoop/