Did you know that Hadoop was the knowledge base behind the Watson supercomputer? I didn’t:
Hadoop was used to create Watson’s “brain,” or the database of knowledge and facilitation of Watson’s processing of enormously large volumes of data in milliseconds. Watson depends on 200 million pages of content and 500 gigabytes of preprocessed information to answer Jeopardy questions. That huge catalog of documents has to be searchable in seconds.
I’d love to read about what other open source tools were used to build Watson. For example, did Watson use the Python-based Natural Language Toolkit?
Update: In a comment, Jeroen Latour points to a presentation about Watson’s DeepQA project and an article available in PDF format.
Original title and link: Jeopardy Goes to Hadoop (NoSQL databases © myNoSQL)