Building Watson A Brief Overview of the DeepQA Project

Similar documents
IBM Watson : Beyond playing Jeopardy!

What is Watson An Overview

Watson. An analytical computing system that specializes in natural human language and provides specific answers to complex questions at rapid speeds

MAN VS. MACHINE. How IBM Built a Jeopardy! Champion x The Analytics Edge

» A Hardware & Software Overview. Eli M. Dow <emdow@us.ibm.com:>

Putting IBM Watson to Work In Healthcare

Andre Standback. IT 103, Sec /21/12. IBM s Watson. GMU Honor Code on I am fully aware of the

MapReduce and Hadoop Distributed File System

Paul J. Ledak Vice President, IBM Research IBM Corporation

How Big Data and Artificial Intelligence Change the Game for. presented by Jamie Bisker Senior Analyst, P&C Insurance Aite Group

IBM's Watson could usher in new era of ALS research and medicine ons/ideas/index.html?

MapReduce and Hadoop Distributed File System V I J A Y R A O

What is Artificial Intelligence?

Web Server Software Architectures

CSC384 Intro to Artificial Intelligence

A Strategic Approach to Unlock the Opportunities from Big Data

Dr. John E. Kelly III Senior Vice President, Director of Research. Differentiating IBM: Research

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

Big Data, Analytics, Intelligence: Potenziale und Nutzen

PLA 7 WAYS TO USE LOG DATA FOR PROACTIVE PERFORMANCE MONITORING. [ WhitePaper ]

The 5 New Realities of Server Monitoring

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

Evaluation of Enterprise Data Protection using SEP Software

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Natural Language to Relational Query by Using Parsing Compiler

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering

IBM Netezza High Capacity Appliance

Big Data Challenges. technology basics for data scientists. Spring Jordi Torres, UPC - BSC

Scalable Cloud Computing Solutions for Next Generation Sequencing Data

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning

Auto-Classification for Document Archiving and Records Declaration

Big Data Challenges in Bioinformatics

High Performance Computing and Big Data: The coming wave.

This presentation provides an overview of the architecture of the IBM Workload Deployer product.

Graph Database Performance: An Oracle Perspective

Colgate-Palmolive selects SAP HANA to improve the speed of business analytics with IBM and SAP

An Alternative Storage Solution for MapReduce. Eric Lomascolo Director, Solutions Marketing

Logging for Cloud Computing Forensic Systems

Data management challenges in todays Healthcare and Life Sciences ecosystems

Getting to Know Big Data

TeachingEnglish Lesson plans. Conversation Lesson News. Topic: News

Collaborations between Official Statistics and Academia in the Era of Big Data

Microsoft SQL Server 2008 R2 Enterprise Edition and Microsoft SharePoint Server 2010

Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

Analysis of Cloud Solutions for Asset Management

PARALLELS CLOUD STORAGE

GeoGrid Project and Experiences with Hadoop

PUSD High Frequency Word List

HADOOP ON ORACLE ZFS STORAGE A TECHNICAL OVERVIEW

Integrating Apache Spark with an Enterprise Data Warehouse

Introduction to Big Data! with Apache Spark" UC#BERKELEY#

Big Data: Study in Structured and Unstructured Data

Topology Aware Analytics for Elastic Cloud Services

Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014

Oracle Big Data SQL Technical Update

Cray: Enabling Real-Time Discovery in Big Data

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

Depth-of-Knowledge Levels for Four Content Areas Norman L. Webb March 28, Reading (based on Wixson, 1999)

Grid Scheduling Dictionary of Terms and Keywords

Search and Information Retrieval

How To Use Hadoop For Gis

A stream computing approach towards scalable NLP

2009 Oracle Corporation 1

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

Performance rule violations usually result in increased CPU or I/O, time to fix the mistake, and ultimately, a cost to the business unit.

Who needs humans to run computers? Role of Big Data and Analytics in running Tomorrow s Computers illustrated with Today s Examples

Getting started with a data quality program

Product Version 1.0 Document Version 1.0-B

DBpedia German: Extensions and Applications

In-memory databases and innovations in Business Intelligence

Statistical Machine Translation: IBM Models 1 and 2

Search Engine Based Intelligent Help Desk System: iassist

Patent Big Data Analysis by R Data Language for Technology Management

Transcription:

Building Watson A Brief Overview of the DeepQA Project Eddie Epstein, Manager of Scaleout DeepQA Team @ IBM Research 1

Want to Play Chess or Just Chat? Chess A finite, mathematically well-defined search space Limited number of moves and states All the symbols are completely grounded in the mathematical rules of the game Human Language Words by themselves have no meaning Only grounded in human cognition Words navigate, align and communicate an infinite space of intended meaning Computers can not ground words to human experiences to derive meaning

The Jeopardy! Challenge: A compelling and notable way to drive and measure the technology of automatic Answering along 5 Key Dimensions Broad/Open Domain Complex Language High Precision Accurate Confidence High Speed $200 If you're standing, it's the direction you should look to check out the wainscoting. $600 In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus $1000 Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that s farthest north

Broad Domain We do NOT attempt to anticipate all questions and build specialized databases. In a random sample of 20,000 questions we found 2,500 distinct types*. The most frequent occurring <3% of the time. The distribution has a very long tail. And for each these types 1000 s of different things may be asked. Even going for the head of the tail will barely make a dent *13% are non-distinct (e.g., it, this, these or NA) Our Focus is on reusable NLP technology for analyzing volumes of as-is text. Structured sources (DBs and KBs) are used to help interpret the text. 4

The Best Human Performance: Our Analysis Reveals the Winner s Cloud Each dot represents an actual historical human Jeopardy! game Winning Human Performance Grand Champion Human Performance 2007 QA Computer System 5 More Confident Less Confident

Not Just for Fun A long, tiresome speech delivered by a frothy pie topping Category: Edible Rhyme Time 6

Not Just for Fun Some s require and Synthesis A long, tiresome speech delivered by a frothy pie topping. Diatribe. Harangue. Meringue. Whipped Cream. Answer: Meringue Harangue Category: Edible Rhyme Time 7

Divide and Conquer (Typical in Final Jeopardy!) When "60 Minutes" premiered this man was U.S. president.? 8

Divide and Conquer (Typical in Final Jeopardy!) Must identify and solve sub-questions from different sources to answer the top level question Lyndon B Johnson In 1968 this man was U.S. president. When "60 Minutes" premiered this man was U.S. president. 9

The Missing Link Shirts, TV remote controls, Telephones

The Missing Link Buttons Shirts, TV remote controls, Telephones

The Missing Link Buttons Shirts, TV remote controls, Telephones On hearing of the discovery of George Mallory's body, he told reporters he still thinks he was first.

The Missing Link Buttons Shirts, TV remote controls, Telephones Mt Everest Edmund Hillary He was first On hearing of the discovery of George Mallory's body, he told reporters he still thinks he was first.

Watson s Knowledge for Jeopardy! Watson has analyzed and stored the equivalent of about 1 million books (e.g., encyclopedias, dictionaries, news articles, reference texts, plays, etc) Watson also uses structured sources such as WordNet and DBpedia

Automatic Learning From Reading Sentence Parsing Generalization & Statistical Aggregation Volumes of Text Syntactic Frames subject verb object Semantic Frames Inventors patent inventions (.8) Officials Submit Resignations (.7) People earn degrees at schools (0.9) Fluid is a liquid (.6) Liquid is a fluid (.5) Vessels Sink (0.7) People sink 8-balls (0.5) (in pool/0.8)

How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE

How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink)

How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results

How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results Candidate Answer Generation Isaac Newton Wilhelm Tempel HMS Paramour Christiaan Huygens Halley s Comet Edmond Halley Pink Panther Peter Sellers

How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results Candidate Answer Generation Isaac Newton Wilhelm Tempel HMS Paramour Christiaan Huygens Halley s Comet Edmond Halley Pink Panther Peter Sellers Evidence Retrieval

How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results Candidate Answer Generation Isaac Newton Wilhelm Tempel HMS Paramour Christiaan Huygens Lexical Taxonomic Spatial [0.58 0-1.3 0.97] [0.71 1 13.4 0.72] [0.12 0 2.0 0.40] [0.84 1 10.6 0.21] Temporal Halley s Comet [0.33 0 6.3 0.83] Edmond Halley [0.21 1 11.1 0.92] Pink Panther [0.91 0-8.2 0.61] Peter Sellers Evidence Retrieval [0.91 0-1.7 0.60] Evidence Scoring

How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results Candidate Answer Generation Isaac Newton Wilhelm Tempel HMS Paramour Christiaan Huygens Lexical Taxonomic Spatial [0.58 0-1.3 0.97] [0.71 1 13.4 0.72] [0.12 0 2.0 0.40] [0.84 1 10.6 0.21] Temporal Calculate Rank & Confidence Halley s Comet [0.33 0 6.3 0.83] Edmond Halley Pink Panther Peter Sellers Evidence Retrieval [0.21 1 11.1 0.92] [0.91 0-8.2 0.61] [0.91 0-1.7 0.60] Evidence Scoring 1) Edmond Halley (0.85) 2) Christiaan Huygens (0.20) 3) Peter Sellers (0.05)

Evidence Profiles: Time, Popularity, Source, etc. Clue: You ll find Bethel College and a Seminary in this holy Minnesota city. Positive Evidence Negative Evidence

Opportunities for Parallel Processing Primary Searches are independent, and each Search Result is analyzed independently to generate candidate answers Candidate Generation Isaac Newton Primary Search Results Wilhelm Tempel Christiaan Huygens Wilhelm Tempel Edmond Halley Halley s Comet Edmond Halley HMS Paramour Pink Panther Peter Sellers

Opportunities for Parallel Processing Evidence is Retrieved for each Candidate Answer independently Evidence scoring is done independently Evidence Retrieval Evidence Scoring [0.58 0-1.3 0.97] Christiaan Huygens Edmond Halley [0.71 1 13.4 0.72] [0.12 0 2.0 0.40] [0.84 1 10.6 0.21] 1) Edmond Halley (0.85) 2) Christiaan Huygens (0.20) 3) Peter Sellers (0.05)

DeepQA: The Technology Behind Watson Massively Parallel Probabilistic Evidence-Based Architecture Case/ / Topic Data /Case Analysis Multiple Interpretations 100s sources Fact, Finding and 100s Possible Answers Hypothesis Generation 1000 s of Pieces of Evidence Hypothesis and Evidence Scoring 100,000 s scores from many simultaneous Text Analysis Algorithms Synthesis Final Confidence Merging & Ranking Hypothesis Generation Hypothesis and Evidence Scoring... Confidence- Weighted Answers

DeepQA: The Technology Behind Watson Massively Parallel Probabilistic Evidence-Based Architecture Case/ / Topic Data /Case Analysis Multiple Interpretations 100s sources Fact, Finding and 100s Possible Answers Hypothesis Generation 1000 s of Pieces of Evidence Hypothesis and Evidence Scoring 100,000 s scores from many simultaneous Text Analysis Algorithms Synthesis Final Confidence Merging & Ranking Hypothesis Generation Hypothesis and Evidence Scoring... Confidence- Weighted Answers

DeepQA: The Technology Behind Watson Massively Parallel Probabilistic Evidence-Based Architecture Case/ / Topic Data /Case Analysis Multiple Interpretations 100s sources Fact, Finding and 100s Possible Answers Hypothesis Generation 1000 s of Pieces of Evidence Hypothesis and Evidence Scoring 100,000 s scores from many simultaneous Text Analysis Algorithms Synthesis Final Confidence Merging & Ranking Hypothesis Generation Hypothesis and Evidence Scoring... Confidence- Weighted Answers

DeepQA on Apache UIMA Aggregate Analysis Engine Analysis Engine Analysis Engine Analysis Engine Analysis Engine Primary Candidate Answer Analysis Searches Generation Scoring Analysis Engine Supporting Analysis Engine Deep Evidence Analysis Engine Final Evidence Search Scoring Merger Flow Controller

Multipliers used to Organize Analysis Data Aggregate Analysis Engine Analysis Engine Analysis Engine Analysis Engine Analysis Engine Primary Candidate Answer Analysis Searches Generation Scoring Analysis Engine Analysis Engine Analysis Engine Supporting Deep Evidence Final Flow Controller Evidence Search Scoring Merger

An Aggregate AE in Core UIMA is Single Threaded Aggregate Analysis Engine Analysis Engine 1 Annotator Analysis Engine Annotator Flow Controller

UIMA-AS Enables Multithreaded Operation Async Aggregate Analysis Engine 5 Input Queue 4 In Queue Flow Controller Analysis Engine 1 2 3 Annotator In Queue Analysis Engine Analysis Engine Analysis Engine Annotator

Selected Watson Processes Distributed Search Daemons Evidence Evidence Control Evidence Primary Control Searches Control 1..N Evidence Evidence Control Evidence Supporting Control Evidence Control Search Hypothesis Hypothesis Hypothesis Hypothesis Generation Hypothesis Generation Hypothesis Generation Hypothesis Generation Hypothesis Generation Hypothesis Generation Candidate Generation Generation Hypothesis Answer Generation Generation Generation Scoring Deep Deep Evidence Retrieval Deep Evidence Evidence Retrieval and Deep Retrieval and Scoring Deep Evidence and Scoring Deep Evidence Deep Scoring Evidence Deep Scoring Evidence Deep Scoring Evidence Scoring Evidence Scoring Scoring Evidence Evidence Control Evidence Supporting Control Evidence Control Scaleout Top level Aggregate & Topic Analysis Scaleout Control Final Confidence Merging & Ranking Answer & Confidence

Watson Inter-Process Data Flow Hypothesis Hypothesis Generation Hypothesis Generation Hypothesis Generation Hypothesis Generation Candidate Generation Generation Hypothesis Hypothesis Generation Hypothesis Generation Hypothesis Generation Generation Hypothesis Answer Generation Scoring Deep Deep Evidence Retrieval Deep Evidence Evidence Retrieval and Deep Retrieval and Scoring Deep Evidence and Scoring Deep Evidence Deep Scoring Evidence Deep Scoring Evidence Deep Scoring Evidence Scoring Evidence Scoring Scoring Evidence Evidence Control Evidence Supporting Control Evidence Control Search JMS Broker Primary Primary Search Sup Evid Search Scaleout Distributed Search Daemons Top Level Aggregate Content & Prismatic Servers Broker-based transactions Indri transactions CS & Prismatic transactions

Development System Timing Before Scale Out Single-threaded computation Search indexes on disk Remote Sesame server

After first 8 months of Scaleout Work Move everything into RAM Scale out components with UIMA-AS Distribute search

12 more months of Scaleout Work Pre-compute deep NLP analysis of entire text corpus Hammer on every computation outlier Expand cluster

Technology Classics The Great Outdoors Speak of the Dickens Mind Your Manners Before and After $200 $200 $200 $200 $200 $200 $400 $400 $400 $400 $400 $400 $600 $600 $600 $600 $600 $600 $800 $800 $800 $800 $800 $800 $1000 $1000 $1000 $1000 $1000 $1000 Jeopardy! Game Control System Real-Time Game Configuration Used in Sparring and Exhibition Games Human Player 1 Human Player 2 Strategy & Text-to-Speech Watson s Game Controller Clue & Category Answers & Confidences Watson s QA Engine 400 Processes 90 IBM Power 750s 15 TB RAM 2 TB Disk Clues, Scores & Other Game Data Analysis of natural language content equivalent to 1 Million Books

Some Growing Pains (early answers) THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them

Some Growing Pains (early answers) THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One

Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918

Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence

Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean

Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang

Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this

Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this urinate

Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this Call on the phone urinate FATHERLY NICKNAMES This Frenchman was "The Father of Bacteriology"

Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this Call on the phone urinate FATHERLY NICKNAMES This Frenchman was "The Father of Bacteriology" How Tasty Was My Little Frenchman

Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this Call on the phone urinate Louis Pasteur FATHERLY NICKNAMES This Frenchman was "The Father of Bacteriology" How Tasty Was My Little Frenchman

In-Category Learning CELEBRATIONS OF THE MONTH What the Jeopardy! Clue is asking for is NOT always obvious Watson can try to infer the type of thing being asked for from the previous answers. In this example after seeing 2 correct answers Watson starts to dynamically learn a confidence that the question is asking for something that it can classify as a month. Clue Type Watson s Answer Correct Answer D-DAY ANNIVERSARY & MAGNA CARTA DAY NATIONAL PHILANTHROPY DAY & ALL SOULS' DAY NATIONAL TEACHER DAY & KENTUCKY DERBY DAY ADMINISTRATIVE PROFESSIONALS DAY & NATIONAL CPAS GOOF-OFF DAY NATIONAL MAGIC DAY & NEVADA ADMISSION DAY day Runnymede June day Day/month(.2) Day of the Dead Churchill Downs November May day / month(.6) April April day / month(.8) October October

DeepQA: Incremental Progress in Answering Precision on the Jeopardy Challenge: 6/2007-11/2010 IBM Watson Playing in the Winners Cloud v0.8 11/10 V0.7 04/10 v0.6 10/09 v0.5 05/09 v0.4 12/08 v0.3 08/08 v0.2 05/08 v0.1 12/07 Baseline 12/06

Watson Workload Optimized System (Power 750) 90 x IBM Power 750 servers 32 POWER7 3.55 GHz cores each 10 Gb Ethernet network 15 Terabytes of total memory 2 Terabyte filesystem for Jeopardy! application Linux provides a cost-effective open platform which is performance-optimized to exploit POWER 7 systems 10 racks include servers, networking, shared disk system, cluster controllers

Precision, Confidence & Speed Deep Analytics Combining many analytics in a novel architecture, we achieved very high levels of Precision and Confidence over a huge variety of as-is content. Speed By optimizing Watson s computation for Jeopardy! on over 2,800 POWER7 processing cores we went from hours per question on a single CPU to an average of under 3 seconds. Results in 2010 Watson played 55 real-time sparring games against former Tournament of Champion Players, playing competitively in all games -- and placing 1 st in 71% of them!

Potential Business Applications Healthcare / Life Sciences: Diagnostic Assistance, Evidenced- Based, Collaborative Medicine Tech Support: Help-desk, Contact Centers Enterprise Knowledge Management and Business Intelligence Government: Improved Information Sharing and Security

Evidence Profiles from disparate data is a powerful idea Each dimension contributes to supporting or refuting hypotheses based on Strength of evidence Importance of dimension for diagnosis (learned from training data) Evidence dimensions are combined to produce an overall confidence Overall Confidence Positive Evidence Negative Evidence

The Core Technical Team* Researchers and Engineers in NLP, ML, IR, KR&R and CL at IBM Labs and a growing number of universities Siddarth Patwardhan There is a broader team that contributed to delivering Watson for the Stage, to compete in Jeopardy Games 56 IBM Confidential *NOT full-time Equivalents. Names listed if contributed some time to that part of project.

THANK YOU