Watson in Space: Advanced Decision Support Systems for NASA exploiting IBM's Deep QA Analytics

Paul Giangarra, IBM Distinguished Engineer
This work has been done in collaboration with Doug Stanley (NIA Vice-President of Research and Program Development).

Agenda
  Motivation and Definitions
  Systems of Systems
  FDIR (and ADIR)
  Decision Support Systems (structured and unstructured)
  Building on Record, Retrieve, Analyze Pattern
  COTS based plug and play LCS
  IBM's Watson technologies
  Bringing it all together to build Advanced Decision Support Systems

Systems of Systems
  Most new spacecraft are Systems of Systems
  Often the systems are built by different vendors
  Integration is usually done by the prime vendor (or NASA)
  Assertion: A large unsolved problem is how to do complete Anomaly/Failure Isolation in large, complex Systems of Systems

A Short History Diversion
  On July 20, 1969, Neil Armstrong and Buzz Aldrin had entered the Lunar Module they named 'Eagle' and were descending to the surface. They were about 6,000 feet above the surface and the descent engine was halfway through its final 12-minute burn that would land them safely on the moon, when a yellow caution light lit up on the computer control panel. It was a 1202 error, indicating a memory overload, and the astronauts asked Mission Control for instructions.
  M.I.T. engineer, George Silver, who was usually at the office at Cape Kennedy. George had been involved in and witnessed many pre-flight tests. I asked him in frustration if he had ever seen the Apollo Guidance Computer run slowly and under what conditions. To my surprise and rather matter of fact, he said he had. He called it "cycle stealing" and he said it can occur when the I/O system keeps looking for data. He had seen it when the Rendezvous Radar Switch was on (in the AUTO position) and the computer was looking for radar data. He asked "the Switch isn't on, is it?" "Why would it be on for Descent, it's meant for Ascent?"

Background: Operations and Why the Problem Gets Harder
  Mission Duration: Near Earth Objects
    Cruise time: 90 days
    Mission durations: ~6 months
    Communication delay: minutes
    Communications blackouts: zero (assuming you pay for DSN coverage)
  Mission Duration: Mars
    Cruise time: 9 months
    Mission durations: ~3 years
    Communications delay: 6 minutes to 50 minutes
    Communications blackouts: weeks at a time every 780 days
  From the NASA Exploration Technology Development Program Automation for Operations (A4O) Transition Review
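The communication delays listed for Mars follow from light travel time. The sketch below is a rough back-of-the-envelope check, not a figure from the A4O review: the Earth-Mars distances are approximate round values assumed for illustration, and the function name is hypothetical.

```python
# Speed of light in vacuum, km/s (exact by SI definition).
C_KM_PER_S = 299_792.458
# Astronomical unit in km (IAU 2012 definition).
AU_KM = 149_597_870.7

# Assumed approximate Earth-Mars distances (illustrative, not from the slides).
MARS_CLOSEST_AU = 0.38
MARS_FARTHEST_AU = 2.67


def one_way_delay_minutes(distance_au):
    """Return the one-way signal travel time in minutes for a distance in AU."""
    return distance_au * AU_KM / C_KM_PER_S / 60.0


if __name__ == "__main__":
    for label, au in (("closest", MARS_CLOSEST_AU), ("farthest", MARS_FARTHEST_AU)):
        one_way = one_way_delay_minutes(au)
        print(f"Mars {label}: one way {one_way:.1f} min, round trip {2 * one_way:.1f} min")
```

Under these assumed distances the one-way delay is roughly 3 to 22 minutes, and a command-and-acknowledge round trip is roughly 6 to 44 minutes, which is the same order as the range quoted on the slide.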

Background: Operations
  Mission Operations
    State of practice: Many tools, tools often modified, new tools must be added, lack of tool interoperability
    Need: Flexible, evolvable and sustainable mission operations tools
  Crewed Spacecraft Operations
    State of practice: Crew relies on ground to support and control operations
    Time delays reduce crew flexibility and efficiency
    Need: Crews able to operate systems more independently
  Uncrewed Spacecraft Operations
    State of practice: Requires direct human command and monitoring
    Time delays reduce flexibility and efficiency, large staff requirements
    Need: Safe, efficient and effective uncrewed operations
  From the NASA Exploration Technology Development Program Automation for Operations (A4O) Transition Review

The Core Pattern: A/FDIR (Anomaly/Failure Detection, Isolation, Recovery)
  Fully autonomous A/FDIR is at the core of fully autonomous operations
  A/F Detection is fairly well understood and generally performed by computers
  A/F Isolation is still not fully autonomous; there are many cases where we fall back to documentation, human memory, and other unstructured information
  A/F Recovery is only attainable after A/F Isolation is correct
  When we can fully automate A/FDIR we will have built HAL

A/FDIR Detection
  "Houston, we have a problem": how often have we heard that?
  Detection can be at the subsystem level or system of systems level
  Detection can be performed by:
    Hardware
    Software Programming (usually C/C++, Java, and assembler)
    Complex Event Detection via rules based Middleware (used for detecting problems with state and often over a long duration)
    Information streaming and analysis platforms (used for detecting problems with little state and one or more high volume input streams)
  Critical aspect of A/FD: Detection involves determining when something needs to be done, not what to do
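As a minimal sketch of the streaming style of detection (little state, one high volume input stream), the example below flags readings that deviate from a rolling mean. The function name, window size, and tolerance are assumptions chosen for illustration; it only reports when something needs attention, not what to do about it.

```python
from collections import deque


def detect_anomalies(readings, window=5, tolerance=3.0):
    """Yield (index, value) for readings that deviate from the rolling mean
    of the last `window` normal readings by more than `tolerance`."""
    recent = deque(maxlen=window)
    for index, value in enumerate(readings):
        if len(recent) == window:
            mean = sum(recent) / window
            if abs(value - mean) > tolerance:
                yield index, value
                continue  # keep the outlier out of the baseline
        recent.append(value)


if __name__ == "__main__":
    cabin_temps = [20.0, 20.1, 19.9, 20.0, 20.2, 27.5, 20.1]
    print(list(detect_anomalies(cabin_temps)))
```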

A/FDIR Isolation
  Isolation is the toughest problem
    When humans are in the loop they need decision support
    Without humans, computers have to attempt to do it alone
  Isolation can be at the subsystem level or system of systems level; however:
    Often a problem with a subsystem is not indicative of the full problem
    Recovery actions must understand the effect on the system of systems
  Can often have significant time constraints
  Can involve understanding both structured and unstructured information

A/FDIR Isolation (2)
  Isolation can be performed by:
    Hardware (e.g. PLMs)
    Software Programming (usually C/C++, Java, and assembler)
    Rules based Middleware (used for decision support, single state in, single response out; however, it can involve a hierarchical decision process)
    When all else fails: RTxM (Read The [pick your word] Manual)
  Critical aspect of A/FI: Isolation involves determining what to do based on facts collected by A/FD and other data available to the Decision Support System
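The rules based approach (single state in, single response out, evaluated hierarchically) can be sketched as below. The rule table, thresholds, and diagnoses are hypothetical examples, loosely echoing the radar switch story told earlier; returning None stands for the "when all else fails" case where the manual or other unstructured information is needed.

```python
# Hypothetical rule table: each rule maps one observed state to one diagnosis.
RULES = {
    "subsystem": [
        (lambda s: s.get("bus_voltage", 28.0) < 24.0, "power bus undervoltage"),
    ],
    "system_of_systems": [
        (lambda s: bool(s.get("cpu_overload")) and s.get("radar_switch") == "AUTO",
         "I/O cycle stealing caused by radar data polling"),
    ],
}


def isolate(state):
    """Check subsystem rules first, then system-of-systems rules.

    Returns (level, diagnosis), or None when no rule matches.
    """
    for level in ("subsystem", "system_of_systems"):
        for condition, diagnosis in RULES[level]:
            if condition(state):
                return level, diagnosis
    return None


if __name__ == "__main__":
    print(isolate({"cpu_overload": True, "radar_switch": "AUTO"}))
    print(isolate({"cpu_overload": True, "radar_switch": "OFF"}))
```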

A/FDIR Recovery
  The ultimate goal: recover from the A/F detected. It gets more complex when humans are involved
    When humans are in the loop they need decision support systems to provide good advice for recovery alternatives, but recovery is often executed by the humans
    Without humans involved, computers have to attempt to do it alone
    Computers may also need to attempt recovery when they are confident they know the solution (what) and there is not enough time for humans to respond (e.g. time critical problems)
  Recovery can be at the subsystem level or system of systems level; however:
    Recovery actions must understand the effect on the system of systems
  Can often have significant time constraints
  Can be pre-substantiated by techniques such as predictive analytics

A/FDIR Technological View
  A/F Detection tells you when you need to do something
    Technologies involved include streaming analytics, event correlation, complex event processing, record and retrieve
  A/F Isolation tells you what happened
    Technologies involved include finite state machines, rules based decision making tools, descriptive analytics, Deep QA with Natural Language Processing
  A/F Recovery tells you what to do next
    Technologies involved include predictive and prescriptive analytics
  Detect, Decide, Act

Advanced Decision Support Systems Involve
  Operational Decision Management: Focuses on the automation and governance of frequently occurring, repeatable decisions that control critical business systems
  Analytical Decision Management: Focuses on the development and deployment of decision services bringing intelligence and predictive insight into repeatable decisions while maximizing outcomes
  Enabled by: Business Rule Management with Business Event Processing
  Decision Management
  Enhanced by: Predictive Analytics with Optimization, Deep QA
  Closely integrated with: Analytical Decision Management, Business Process Management
  Closely integrated with: Operational Decision Management, Business Intelligence

Structured Decision Support
  Structured Decision Support:
    Can reliably be fully automated
      Code, state machine steps
      Rules
      Supported by modeling (predictive and prescriptive)
    Structured decision support can provide recommendations and decisions, as well as providing impact and quality analytics of each in a set of possible courses of action
    Utilize unstructured decision support for additional information. Possibly compare results of unstructured support to results from structured decision results.
    Know when unstructured decision support needs to be invoked (when all else fails...)

Unstructured Decision Support
  Unstructured Decision Support:
    Can consume, process, and interpret problem descriptions in addition to the structured input available
    Utilize advanced NLP (Natural Language Processing) techniques to understand unstructured information
    Will utilize known considerations and conclusions in the decision
    Can ingest, understand, and exploit more types of data
    Can quickly and efficiently utilize a large corpus of unstructured information
    Can analyze the original questions and take the users through a dialogue of follow up questions before providing a final set of suggested answers

NASA 21st Century Launch Complex PoC
  Photos taken by Paul Giangarra

Record, Retrieve, Analyze, & Visualize Pattern
  [Diagram: The Core Pattern. Components shown: Sensors; Real-time Enterprise Service Bus; Data Persistence; Analytics Framework; Rules developed by SMEs; Reports developed by SMEs; Visualization and Drill Down; Inform/Act; Decision Support Systems. Legend: COTS; Developed on COTS; existing.]

The Value to NASA
  Computer Scientists can focus on what they do best
    Architect, install, configure and run the infrastructure
    Develop the needed missing parts (not available in the COTS components)
  Other Scientists and Engineers can focus on what they do best (and build the rules and visual components based on the COTS tools provided in the solution)
  COTS based infrastructure is usually easier to build, run, and maintain, especially if chosen components are designed for industrial strength environments
  The COTS based products chosen are designed with built in scalability, reliability, business continuity, and more
  Rules are easier to produce than code, less error prone, and more flexible
  Faster turnaround of extensions, changes, modifications

An IBM Grand Challenge
  Build a system that rivals a human's ability to answer questions posed in natural language with speed, accuracy and confidence

Grand Challenges Advance the Science of Computing
  Chess: Deep Blue, 1997
    Limited number of moves and states
    Explicit, unambiguous mathematical rules
  Human Language: Watson, 2011
    Ambiguous, contextual and implicit
    Grounded in human understanding
    Infinite expressions with same meaning

Jeopardy!
  Questions cover a broad range of topics: history, literature, politics, arts, science, etc.
  Fast responses, with accuracy and confidence
  Word plays, subtle meaning, ironies, riddles

Technical Challenges
  Massive data volumes and collection rates
    Stresses scaling limits of current systems: ingest, storage
    Degrades accessibility, awareness, timely use
  Unstructured nature precludes using traditional data discovery and exploration
    Adding structure during ingestion at high speed
    Possibly degraded, obfuscated, fragmented
  Relevance determination to avoid information overload
    Method of specifying, defining
    Improve with experience
  Analyst and mission are at the center
    Leverage, amplify human analyst experience, insight
    Augment or replace those tasks better done algorithmically
    System as apprentice learns from analyst actions, improves with time

WATSON Technology
  Three pieces of WATSON technology
    Natural language processing
    Assembly of information and making it storable
    Searching and ranking of results from the data searches
  Understands the questions, takes the set of information that you've provided it, and ranks the results according to the problem.
  It is very specific to a problem set. For this presentation, the problem set is extended operations for space exploration.

Unstructured data is complex
  Where was Einstein born?
  Structured:
    Person | Born In
    A. Einstein | Ulm
  Unstructured:
    "One day, from among his city views of Ulm, Otto chose a watercolor to send to Albert Einstein as a remembrance of Einstein's birthplace."
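The contrast can be made concrete with a small sketch. The table lookup answers the question directly, while a naive surface pattern (an assumption chosen for illustration, not how Watson works) finds nothing in the sentence, because the birthplace is only implied.

```python
import re

# Structured form: the answer is a direct key lookup.
BORN_IN = {"A. Einstein": "Ulm"}

# Unstructured form: the sentence from the slide.
SENTENCE = (
    "One day, from among his city views of Ulm, Otto chose a watercolor "
    "to send to Albert Einstein as a remembrance of Einstein's birthplace."
)


def naive_born_in(text, surname):
    """Look for the literal pattern '<surname> was born in <Place>'."""
    match = re.search(rf"{re.escape(surname)} was born in (\w+)", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(BORN_IN["A. Einstein"])               # direct lookup succeeds
    print(naive_born_in(SENTENCE, "Einstein"))  # the literal pattern is absent
```

Recovering "Ulm" from the sentence requires connecting "city views of Ulm" with "Einstein's birthplace", which is the kind of inference keyword or pattern matching alone does not provide.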

Some Key Definitions
  What is Text Analytics?
    Text Analytics describes a set of linguistic, statistical, and machine learning techniques that allow text to be analyzed and key information extracted for business integration.
  What is Content Analytics?
    Content Analytics (Text Analytics + Mining) refers to the text analytics process plus the ability to visually identify and explore trends, patterns, and statistically relevant features found in various types of content spread across various content sources.

Watson Took Content Analytics a Huge Step Forward
  Content Analytics
    Provides a robust data ingest, search, and visualization capability
    Sitting on UIMA, it will handle any unstructured data for which there are annotators to examine, classify, and extract information
    Primarily driven by the user making decisions about what to look at and where to go next in the analysis
  DeepQA, the Watson technology
    Also focuses on unstructured data
    However, represents a breakthrough in AI technology, by answering very open ended questions
    Evaluates evidence obtained via text analytics to mimic the human thought process

How Does IBM Content Analytics Work?
  Based on Unstructured Information Management Architecture (UIMA)
  [Diagram labels: Information Sources; UIMA Annotators (Identify Language, Tokenization, Word Analytics, Named Entity Extraction, Multi-word Analytics, Automatic Classifier, Plug-in Custom Analytics); Enhanced Metadata; Analytics Index; Corpus; Analyzed Documents with identified concepts.]
  Example: "John sprained his ankle in the john..." annotated with Noun, Verb, Noun Phrase, Prep Phrase and with Person, Injury, Body Part, Location. Extracted Concept: Claimant: Soft Tissue Injury
  UIMA is an open, industrial-strength, scalable and extensible platform for creating, integrating and deploying unstructured information management solutions from combinations of semantic analysis and search components.
  Although UIMA originated at IBM, it is now an OASIS industry standard and an Open Source project which is currently incubating at the Apache Software Foundation.
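The annotator chain can be sketched in plain Python. This is not the UIMA API; the dictionaries, the crude "first capitalized token is a person" heuristic, and all function names are assumptions made only to show how each stage adds annotations that a later stage consumes.

```python
import re

# Hypothetical dictionaries standing in for real linguistic resources.
INJURIES = {"sprained", "fractured", "strained"}
BODY_PARTS = {"ankle", "wrist", "knee"}


def tokenize(doc):
    """Split the text into alphabetic tokens."""
    doc["tokens"] = re.findall(r"[A-Za-z]+", doc["text"])
    return doc


def annotate_entities(doc):
    """Label tokens using dictionary lookups and a crude Person heuristic."""
    doc["annotations"] = []
    for position, token in enumerate(doc["tokens"]):
        lowered = token.lower()
        if lowered in INJURIES:
            doc["annotations"].append(("Injury", token))
        elif lowered in BODY_PARTS:
            doc["annotations"].append(("Body Part", token))
        elif position == 0 and token[0].isupper():
            doc["annotations"].append(("Person", token))
    return doc


def classify(doc):
    """Derive a document-level concept from the entity annotations."""
    labels = {label for label, _ in doc["annotations"]}
    needed = {"Person", "Injury", "Body Part"}
    doc["concept"] = "Claimant: Soft Tissue Injury" if needed <= labels else None
    return doc


PIPELINE = [tokenize, annotate_entities, classify]


def run(text):
    """Pass a document through every annotator in order."""
    doc = {"text": text}
    for annotator in PIPELINE:
        doc = annotator(doc)
    return doc


if __name__ == "__main__":
    result = run("John sprained his ankle in the john")
    print(result["annotations"], result["concept"])
```

Note that the second "john" is left unannotated here; telling the person from the location is exactly the kind of contextual analysis a real annotator has to perform.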

Five Dimensions of Complexity
  Broad/Open Data Domain Complex Language Accuracy Confidence High Precision High Speed
  Example clue, category "EU, The European Union": Each year the EU selects capitals of culture; one of the 2010 cities was this Turkish "meeting place of cultures"
  Answer: Istanbul

What Computers Find Easier (and Hard)
  (ln(12,546,798 * π))^2 / 34,567.46 = 0.00885
  Select Payment where Owner = "David Jones" and Type(Product) = "Laptop"
    Owner | Serial Number
    David Jones | 45322190 AK
    Invoice # | Vendor | Payment
    INV10895 | MyBuy | $104.56
    Serial Number | Type | Invoice #
    45322190 AK | LapTop | INV10895
  David Jones = Dave Jones
  IBM Confidential
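A short sketch of both halves of this slide. Reading the arithmetic expression as the square of the logarithm reproduces the printed value of 0.00885. The name comparison uses the standard library's difflib as a stand-in similarity measure; that choice and the function names are assumptions for illustration.

```python
import math
from difflib import SequenceMatcher

# Easy for a computer: exact arithmetic.
value = math.log(12_546_798 * math.pi) ** 2 / 34_567.46


def same_person_naive(a, b):
    """Exact string equality, which is how a plain database join compares names."""
    return a == b


def name_similarity(a, b):
    """Character-level similarity in [0, 1]; a hint, not proof of identity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    print(f"{value:.5f}")
    print(same_person_naive("David Jones", "Dave Jones"))
    print(f"{name_similarity('David Jones', 'Dave Jones'):.2f}")
```

Exact comparison says the two names differ, and a similarity score only suggests they might match. Deciding whether they refer to the same person needs further evidence, which is the hard part.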

The Big Idea
  Evidence Based Reasoning over Natural Language Content
    Deep Analysis of clues/questions AND content
    Search for many possible answers based on different interpretations of question
    Find, analyze and score EVIDENCE from many different sources (not just one document) for each answer using many advanced NLP and reasoning algorithms
    Combine evidence and compute a confidence value for each possibility using statistical machine learning
    Ranks based on confidence
    And for Jeopardy: if the top answer is above a threshold, buzz in; else keep quiet
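The combine, rank, and threshold steps can be sketched as follows. The logistic combination, the three evidence scorer names, and the weights are all hypothetical stand-ins for learned models; the slides do not specify the actual statistical method.

```python
import math

# Hypothetical learned weights for three illustrative evidence scorers.
WEIGHTS = {"passage_support": 2.0, "type_match": 1.5, "source_reliability": 1.0}
BIAS = -2.5


def confidence(evidence):
    """Combine per-scorer evidence values into one confidence in (0, 1)."""
    z = BIAS + sum(WEIGHTS[name] * score for name, score in evidence.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates):
    """Return (answer, confidence) pairs sorted from most to least confident."""
    scored = [(answer, confidence(evidence)) for answer, evidence in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def respond(candidates, threshold=0.7):
    """Answer only when the top candidate clears the threshold, else stay quiet."""
    answer, top_confidence = rank(candidates)[0]
    return answer if top_confidence >= threshold else None


if __name__ == "__main__":
    candidates = {
        "Istanbul": {"passage_support": 0.9, "type_match": 1.0, "source_reliability": 0.8},
        "Ankara": {"passage_support": 0.2, "type_match": 1.0, "source_reliability": 0.5},
    }
    print(rank(candidates))
    print(respond(candidates))
```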

Informed Decision Making
  Decision Maker: Has Question; Distills to 2-3 Keywords; Reads Documents, Finds Answers; Finds & Analyzes Evidence
  Search Engine: Finds Documents containing Keywords; Delivers Documents based on Popularity
  Decision Maker: Asks NL Question; Considers Answer & Evidence
  Expert: Understands Question; Produces Possible Answers & Evidence; Analyzes Evidence, Computes Confidence; Delivers Response, Evidence & Confidence

IBM DeepQA and FDIR: How Watson Helps With Failure Isolation
  FD(IR): A Failure is Detected. Failure Isolation starts with Structured Methods. When they are unsuccessful in isolating the failure, a question is generated and passed to Watson.
  [Diagram, numbered steps 1 to 5. Stages shown: Analyze Question / Failure Information; Generate Hypotheses; Score Hypotheses and Evidence; Merge & Rank; Final Confidence; Answer with Confidence. Inputs shown: Answer Sources; Evidence Sources; Models.]
  Learned Models help combine and weigh the evidence
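The hand-off described here, structured methods first and a generated question only when they fail, might be orchestrated as in the sketch below. The question template, the function names, and the canned question-answering stub are all assumptions; no actual Watson interface is shown.

```python
def build_question(failure):
    """Turn the collected failure facts into a natural language question."""
    symptoms = ", ".join(f"{key} = {value}" for key, value in sorted(failure.items()))
    return f"What could cause the following conditions: {symptoms}?"


def isolate_failure(failure, structured_methods, ask_unstructured):
    """Try each structured method; fall back to the unstructured system."""
    for method in structured_methods:
        diagnosis = method(failure)
        if diagnosis is not None:
            return {"source": "structured", "diagnosis": diagnosis}
    question = build_question(failure)
    answer, confidence = ask_unstructured(question)
    return {
        "source": "unstructured",
        "question": question,
        "diagnosis": answer,
        "confidence": confidence,
    }


def known_faults(failure):
    """Hypothetical structured method covering one known fault signature."""
    return "power bus undervoltage" if failure.get("bus_voltage", 28.0) < 24.0 else None


def stub_qa(question):
    """Stand-in for a question answering system; returns a canned answer."""
    return "I/O cycle stealing from radar polling", 0.62


if __name__ == "__main__":
    print(isolate_failure({"bus_voltage": 22.0}, [known_faults], stub_qa))
    print(isolate_failure({"alarm": 1202, "radar_switch": "AUTO"}, [known_faults], stub_qa))
```

Returning the confidence alongside the answer lets a human or a downstream recovery step decide whether the suggestion is strong enough to act on.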

Building an Advanced Decision Support System
  Build on and utilize existing computer based A/F Detection systems
  Pass all relevant information collected when an Anomaly/Failure is detected to existing and new A/F Isolation Structured Decision Support Systems
  Use NLP and Deep QA technologies to create a corpus of Knowledge focused on Space Exploration Mission Operations problems
  Add Unstructured Decision Support based on this Corpus for when the Structured Decision Support needs assistance
  Build a smaller Unstructured Decision Support system with a smaller (subset) Corpus of Knowledge for crewed vehicles (CV) that can deal with questions that need immediate advice, in particular situations where the communications latency between the CV and mission control is long

IBM and NIA Proprietary Information 2011, 2012 IBM Corporation