This week, IBM’s supercomputer, Watson (named after IBM’s founder, Thomas J. Watson), took on two of the most successful Jeopardy! contestants of all time in an exhilarating $1 million Jeopardy! face-off between man and machine.
Watson defeated Jeopardy! champions Ken Jennings and Brad Rutter, amassing $77,147 in winnings in a nail-biting three-night tournament that sparked interest in the fields of artificial intelligence and data analytics.
IBM explained that, by matching the text in a question to the text in its vast memory, Watson can analyze the question and deliver an accurate answer in less than three seconds. If there is no match in Watson’s “brain,” it takes a guess based on a confidence level calculated from probabilities.
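To make the idea of a confidence-based guess concrete, here is a toy sketch in Python. It is not IBM's method: the word-overlap scoring, the function names, and the sample data are all assumptions chosen only to illustrate how candidate answers might be ranked by a normalized confidence value.

```python
def overlap_confidences(question, candidates):
    """Score each candidate by how many question words its supporting
    passage shares, then normalize the scores so they sum to 1.
    (Hypothetical scoring rule, used for illustration only.)"""
    question_words = set(question.lower().split())
    raw = {}
    for answer, passage in candidates.items():
        passage_words = set(passage.lower().split())
        raw[answer] = len(question_words & passage_words)
    total = sum(raw.values())
    if total == 0:
        # Nothing matched: every candidate is an equally weak guess.
        return {answer: 1 / len(raw) for answer in raw}
    return {answer: score / total for answer, score in raw.items()}


def best_guess(question, candidates):
    """Return the highest-confidence answer and its confidence."""
    confidences = overlap_confidences(question, candidates)
    answer = max(confidences, key=confidences.get)
    return answer, confidences[answer]


# Made-up candidates: answer -> supporting passage.
candidates = {
    "Hadoop": "open source technology pioneered by yahoo",
    "Watson": "ibm supercomputer",
}
answer, confidence = best_guess("open source technology from yahoo", candidates)
```

A real system would weigh far richer evidence than shared words, but the shape is the same: score the candidates, turn the scores into a confidence, and answer with the strongest one.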
So what makes Watson’s genius possible? A whole lot of storage, sophisticated hardware, superfast processors and Apache Hadoop, the open source technology pioneered by Yahoo! that sits at the epicenter of big data and cloud computing.
Hadoop was used to create Watson’s “brain,” the database of knowledge, and to help Watson process enormous volumes of data in milliseconds. Watson depends on 200 million pages of content and 500 gigabytes of preprocessed information to answer Jeopardy! questions. That huge catalog of documents has to be searchable in seconds. On a single computer, that would be impossible, but by using Hadoop to divide the work across many computers, it can be done.
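The idea of dividing a search across many computers can be sketched in a few lines of Python. This is only an illustration of the split, search and merge pattern, not Hadoop's actual API: the function names are hypothetical, and a local thread pool stands in for the separate machines a real cluster would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(documents, shard_count):
    """Deal documents round-robin into shard_count lists of (id, text)."""
    shards = [[] for _ in range(shard_count)]
    for doc_id, text in enumerate(documents):
        shards[doc_id % shard_count].append((doc_id, text))
    return shards


def search_shard(shard, term):
    """'Map' step: find the documents in one shard that contain the term."""
    needle = term.lower()
    return [doc_id for doc_id, text in shard if needle in text.lower()]


def distributed_search(documents, term, shard_count=4):
    """Search every shard in parallel, then merge the partial results."""
    shards = split_into_shards(documents, shard_count)
    # Each worker thread plays the role of one machine (an assumption
    # made so the sketch runs on a single computer).
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        partial_results = list(
            pool.map(lambda shard: search_shard(shard, term), shards)
        )
    # 'Reduce' step: combine the per-shard matches into one sorted list.
    return sorted(doc_id for part in partial_results for doc_id in part)


documents = [
    "Hadoop runs on clusters",
    "Watson plays Jeopardy",
    "Yahoo uses hadoop",
]
matches = distributed_search(documents, "hadoop", shard_count=2)
```

Because each shard is searched independently, adding more workers shortens the time needed to scan the whole catalog, which is the property that makes a very large document set searchable in seconds.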
In 2005, Yahoo! created Hadoop and since then has been the most active contributor to Apache Hadoop, contributing over 70 percent of the code and running the world’s largest Hadoop implementation, with more than 40,000 servers. As a point of reference, our Hadoop implementation processes 1.5 times the amount of data in the printed collections in the Library of Congress per day, approximately 16 terabytes of data.
We’ve been doing it because we think it’s a game-changer for the Internet. Hadoop is critical to Yahoo!’s business, delivering personalized experiences to our more than 630 million users worldwide. Yahoo! Mail uses Hadoop to fight spam, and the Yahoo! home page content is personalized with Hadoop and a suite of our personalization technologies. What these have in common is that they require processing huge amounts of data very quickly and reliably on large numbers of computers, mirroring Watson’s requirements to win Jeopardy! And just as Hadoop was critical to Watson’s success, it has a fundamental and direct impact on Yahoo!’s performance and bottom line.
So if you’re ever on Jeopardy!... “It is the technology behind every click on Yahoo! and the IBM supercomputer that accomplished the impossible....”
