Building Watson A Brief Overview of the DeepQA Project

Size: px
Start display at page:

Download "Building Watson A Brief Overview of the DeepQA Project"

Transcription

1 Building Watson A Brief Overview of the DeepQA Project Eddie Epstein, Manager of Scaleout DeepQA IBM Research 1

2 Want to Play Chess or Just Chat? Chess A finite, mathematically well-defined search space Limited number of moves and states All the symbols are completely grounded in the mathematical rules of the game Human Language Words by themselves have no meaning Only grounded in human cognition Words navigate, align and communicate an infinite space of intended meaning Computers can not ground words to human experiences to derive meaning

3 The Jeopardy! Challenge: A compelling and notable way to drive and measure the technology of automatic Answering along 5 Key Dimensions Broad/Open Domain Complex Language High Precision Accurate Confidence High Speed $200 If you're standing, it's the direction you should look to check out the wainscoting. $600 In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus $1000 Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that s farthest north

4 Broad Domain We do NOT attempt to anticipate all questions and build specialized databases. In a random sample of 20,000 questions we found 2,500 distinct types*. The most frequent occurring <3% of the time. The distribution has a very long tail. And for each these types 1000 s of different things may be asked. Even going for the head of the tail will barely make a dent *13% are non-distinct (e.g., it, this, these or NA) Our Focus is on reusable NLP technology for analyzing volumes of as-is text. Structured sources (DBs and KBs) are used to help interpret the text. 4

5 The Best Human Performance: Our Analysis Reveals the Winner s Cloud Each dot represents an actual historical human Jeopardy! game Winning Human Performance Grand Champion Human Performance 2007 QA Computer System 5 More Confident Less Confident

6 Not Just for Fun A long, tiresome speech delivered by a frothy pie topping Category: Edible Rhyme Time 6

7 Not Just for Fun Some s require and Synthesis A long, tiresome speech delivered by a frothy pie topping. Diatribe. Harangue. Meringue. Whipped Cream. Answer: Meringue Harangue Category: Edible Rhyme Time 7

8 Divide and Conquer (Typical in Final Jeopardy!) When "60 Minutes" premiered this man was U.S. president.? 8

9 Divide and Conquer (Typical in Final Jeopardy!) Must identify and solve sub-questions from different sources to answer the top level question Lyndon B Johnson In 1968 this man was U.S. president. When "60 Minutes" premiered this man was U.S. president. 9

10 The Missing Link Shirts, TV remote controls, Telephones

11 The Missing Link Buttons Shirts, TV remote controls, Telephones

12 The Missing Link Buttons Shirts, TV remote controls, Telephones On hearing of the discovery of George Mallory's body, he told reporters he still thinks he was first.

13 The Missing Link Buttons Shirts, TV remote controls, Telephones Mt Everest Edmund Hillary He was first On hearing of the discovery of George Mallory's body, he told reporters he still thinks he was first.

14 Watson s Knowledge for Jeopardy! Watson has analyzed and stored the equivalent of about 1 million books (e.g., encyclopedias, dictionaries, news articles, reference texts, plays, etc) Watson also uses structured sources such as WordNet and DBpedia

15 Automatic Learning From Reading Sentence Parsing Generalization & Statistical Aggregation Volumes of Text Syntactic Frames subject verb object Semantic Frames Inventors patent inventions (.8) Officials Submit Resignations (.7) People earn degrees at schools (0.9) Fluid is a liquid (.6) Liquid is a fluid (.5) Vessels Sink (0.7) People sink 8-balls (0.5) (in pool/0.8)

16 How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE

17 How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink)

18 How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results

19 How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results Candidate Answer Generation Isaac Newton Wilhelm Tempel HMS Paramour Christiaan Huygens Halley s Comet Edmond Halley Pink Panther Peter Sellers

20 How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results Candidate Answer Generation Isaac Newton Wilhelm Tempel HMS Paramour Christiaan Huygens Halley s Comet Edmond Halley Pink Panther Peter Sellers Evidence Retrieval

21 How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results Candidate Answer Generation Isaac Newton Wilhelm Tempel HMS Paramour Christiaan Huygens Lexical Taxonomic Spatial [ ] [ ] [ ] [ ] Temporal Halley s Comet [ ] Edmond Halley [ ] Pink Panther [ ] Peter Sellers Evidence Retrieval [ ] Evidence Scoring

22 How Watson Processes a IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE Analysis Keywords: 1698, comet, paramour, pink, AnswerType(comet discoverer) Date(1698) Took(discoverer, ship) Called(ship, Paramour Pink) Primary Search Primary Search Results Candidate Answer Generation Isaac Newton Wilhelm Tempel HMS Paramour Christiaan Huygens Lexical Taxonomic Spatial [ ] [ ] [ ] [ ] Temporal Calculate Rank & Confidence Halley s Comet [ ] Edmond Halley Pink Panther Peter Sellers Evidence Retrieval [ ] [ ] [ ] Evidence Scoring 1) Edmond Halley (0.85) 2) Christiaan Huygens (0.20) 3) Peter Sellers (0.05)

23 Evidence Profiles: Time, Popularity, Source, etc. Clue: You ll find Bethel College and a Seminary in this holy Minnesota city. Positive Evidence Negative Evidence

24 Opportunities for Parallel Processing Primary Searches are independent, and each Search Result is analyzed independently to generate candidate answers Candidate Generation Isaac Newton Primary Search Results Wilhelm Tempel Christiaan Huygens Wilhelm Tempel Edmond Halley Halley s Comet Edmond Halley HMS Paramour Pink Panther Peter Sellers

25 Opportunities for Parallel Processing Evidence is Retrieved for each Candidate Answer independently Evidence scoring is done independently Evidence Retrieval Evidence Scoring [ ] Christiaan Huygens Edmond Halley [ ] [ ] [ ] 1) Edmond Halley (0.85) 2) Christiaan Huygens (0.20) 3) Peter Sellers (0.05)

26 DeepQA: The Technology Behind Watson Massively Parallel Probabilistic Evidence-Based Architecture Case/ / Topic Data /Case Analysis Multiple Interpretations 100s sources Fact, Finding and 100s Possible Answers Hypothesis Generation 1000 s of Pieces of Evidence Hypothesis and Evidence Scoring 100,000 s scores from many simultaneous Text Analysis Algorithms Synthesis Final Confidence Merging & Ranking Hypothesis Generation Hypothesis and Evidence Scoring... Confidence- Weighted Answers

27 DeepQA: The Technology Behind Watson Massively Parallel Probabilistic Evidence-Based Architecture Case/ / Topic Data /Case Analysis Multiple Interpretations 100s sources Fact, Finding and 100s Possible Answers Hypothesis Generation 1000 s of Pieces of Evidence Hypothesis and Evidence Scoring 100,000 s scores from many simultaneous Text Analysis Algorithms Synthesis Final Confidence Merging & Ranking Hypothesis Generation Hypothesis and Evidence Scoring... Confidence- Weighted Answers

28 DeepQA: The Technology Behind Watson Massively Parallel Probabilistic Evidence-Based Architecture Case/ / Topic Data /Case Analysis Multiple Interpretations 100s sources Fact, Finding and 100s Possible Answers Hypothesis Generation 1000 s of Pieces of Evidence Hypothesis and Evidence Scoring 100,000 s scores from many simultaneous Text Analysis Algorithms Synthesis Final Confidence Merging & Ranking Hypothesis Generation Hypothesis and Evidence Scoring... Confidence- Weighted Answers

29 DeepQA on Apache UIMA Aggregate Analysis Engine Analysis Engine Analysis Engine Analysis Engine Analysis Engine Primary Candidate Answer Analysis Searches Generation Scoring Analysis Engine Supporting Analysis Engine Deep Evidence Analysis Engine Final Evidence Search Scoring Merger Flow Controller

30 Multipliers used to Organize Analysis Data Aggregate Analysis Engine Analysis Engine Analysis Engine Analysis Engine Analysis Engine Primary Candidate Answer Analysis Searches Generation Scoring Analysis Engine Analysis Engine Analysis Engine Supporting Deep Evidence Final Flow Controller Evidence Search Scoring Merger

31 An Aggregate AE in Core UIMA is Single Threaded Aggregate Analysis Engine Analysis Engine 1 Annotator Analysis Engine Annotator Flow Controller

32 UIMA-AS Enables Multithreaded Operation Async Aggregate Analysis Engine 5 Input Queue 4 In Queue Flow Controller Analysis Engine Annotator In Queue Analysis Engine Analysis Engine Analysis Engine Annotator

33 Selected Watson Processes Distributed Search Daemons Evidence Evidence Control Evidence Primary Control Searches Control 1..N Evidence Evidence Control Evidence Supporting Control Evidence Control Search Hypothesis Hypothesis Hypothesis Hypothesis Generation Hypothesis Generation Hypothesis Generation Hypothesis Generation Hypothesis Generation Hypothesis Generation Candidate Generation Generation Hypothesis Answer Generation Generation Generation Scoring Deep Deep Evidence Retrieval Deep Evidence Evidence Retrieval and Deep Retrieval and Scoring Deep Evidence and Scoring Deep Evidence Deep Scoring Evidence Deep Scoring Evidence Deep Scoring Evidence Scoring Evidence Scoring Scoring Evidence Evidence Control Evidence Supporting Control Evidence Control Scaleout Top level Aggregate & Topic Analysis Scaleout Control Final Confidence Merging & Ranking Answer & Confidence

34 Watson Inter-Process Data Flow Hypothesis Hypothesis Generation Hypothesis Generation Hypothesis Generation Hypothesis Generation Candidate Generation Generation Hypothesis Hypothesis Generation Hypothesis Generation Hypothesis Generation Generation Hypothesis Answer Generation Scoring Deep Deep Evidence Retrieval Deep Evidence Evidence Retrieval and Deep Retrieval and Scoring Deep Evidence and Scoring Deep Evidence Deep Scoring Evidence Deep Scoring Evidence Deep Scoring Evidence Scoring Evidence Scoring Scoring Evidence Evidence Control Evidence Supporting Control Evidence Control Search JMS Broker Primary Primary Search Sup Evid Search Scaleout Distributed Search Daemons Top Level Aggregate Content & Prismatic Servers Broker-based transactions Indri transactions CS & Prismatic transactions

35 Development System Timing Before Scale Out Single-threaded computation Search indexes on disk Remote Sesame server

36 After first 8 months of Scaleout Work Move everything into RAM Scale out components with UIMA-AS Distribute search

37 12 more months of Scaleout Work Pre-compute deep NLP analysis of entire text corpus Hammer on every computation outlier Expand cluster

38 Technology Classics The Great Outdoors Speak of the Dickens Mind Your Manners Before and After $200 $200 $200 $200 $200 $200 $400 $400 $400 $400 $400 $400 $600 $600 $600 $600 $600 $600 $800 $800 $800 $800 $800 $800 $1000 $1000 $1000 $1000 $1000 $1000 Jeopardy! Game Control System Real-Time Game Configuration Used in Sparring and Exhibition Games Human Player 1 Human Player 2 Strategy & Text-to-Speech Watson s Game Controller Clue & Category Answers & Confidences Watson s QA Engine 400 Processes 90 IBM Power 750s 15 TB RAM 2 TB Disk Clues, Scores & Other Game Data Analysis of natural language content equivalent to 1 Million Books

39 Some Growing Pains (early answers) THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them

40 Some Growing Pains (early answers) THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One

41 Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918

42 Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence

43 Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean

44 Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang

45 Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this

46 Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this urinate

47 Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this Call on the phone urinate FATHERLY NICKNAMES This Frenchman was "The Father of Bacteriology"

48 Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this Call on the phone urinate FATHERLY NICKNAMES This Frenchman was "The Father of Bacteriology" How Tasty Was My Little Frenchman

49 Some Growing Pains (early answers) the People THE AMERICAN DREAM Decades before Lincoln, Daniel Webster spoke of government "made for", "made by" & "answerable to" them No One WW I NEW YORK TIMES HEADLINES An exclamation point was warranted for the "end of" this! In 1918 a sentence Apollo 11 moon landing MILESTONES In 1994, 25 years after this event, 1 participant said, "For one crowning moment, we were creatures of the cosmic ocean the Big Bang THE QUEEN'S ENGLISH Give a Brit a tinkle when you get into town & you've done this Call on the phone urinate Louis Pasteur FATHERLY NICKNAMES This Frenchman was "The Father of Bacteriology" How Tasty Was My Little Frenchman

50 In-Category Learning CELEBRATIONS OF THE MONTH What the Jeopardy! Clue is asking for is NOT always obvious Watson can try to infer the type of thing being asked for from the previous answers. In this example after seeing 2 correct answers Watson starts to dynamically learn a confidence that the question is asking for something that it can classify as a month. Clue Type Watson s Answer Correct Answer D-DAY ANNIVERSARY & MAGNA CARTA DAY NATIONAL PHILANTHROPY DAY & ALL SOULS' DAY NATIONAL TEACHER DAY & KENTUCKY DERBY DAY ADMINISTRATIVE PROFESSIONALS DAY & NATIONAL CPAS GOOF-OFF DAY NATIONAL MAGIC DAY & NEVADA ADMISSION DAY day Runnymede June day Day/month(.2) Day of the Dead Churchill Downs November May day / month(.6) April April day / month(.8) October October

51 DeepQA: Incremental Progress in Answering Precision on the Jeopardy Challenge: 6/ /2010 IBM Watson Playing in the Winners Cloud v0.8 11/10 V0.7 04/10 v0.6 10/09 v0.5 05/09 v0.4 12/08 v0.3 08/08 v0.2 05/08 v0.1 12/07 Baseline 12/06

52 Watson Workload Optimized System (Power 750) 90 x IBM Power 750 servers 32 POWER GHz cores each 10 Gb Ethernet network 15 Terabytes of total memory 2 Terabyte filesystem for Jeopardy! application Linux provides a cost-effective open platform which is performance-optimized to exploit POWER 7 systems 10 racks include servers, networking, shared disk system, cluster controllers

53 Precision, Confidence & Speed Deep Analytics Combining many analytics in a novel architecture, we achieved very high levels of Precision and Confidence over a huge variety of as-is content. Speed By optimizing Watson s computation for Jeopardy! on over 2,800 POWER7 processing cores we went from hours per question on a single CPU to an average of under 3 seconds. Results in 2010 Watson played 55 real-time sparring games against former Tournament of Champion Players, playing competitively in all games -- and placing 1 st in 71% of them!

54 Potential Business Applications Healthcare / Life Sciences: Diagnostic Assistance, Evidenced- Based, Collaborative Medicine Tech Support: Help-desk, Contact Centers Enterprise Knowledge Management and Business Intelligence Government: Improved Information Sharing and Security

55 Evidence Profiles from disparate data is a powerful idea Each dimension contributes to supporting or refuting hypotheses based on Strength of evidence Importance of dimension for diagnosis (learned from training data) Evidence dimensions are combined to produce an overall confidence Overall Confidence Positive Evidence Negative Evidence

56 The Core Technical Team* Researchers and Engineers in NLP, ML, IR, KR&R and CL at IBM Labs and a growing number of universities Siddarth Patwardhan There is a broader team that contributed to delivering Watson for the Stage, to compete in Jeopardy Games 56 IBM Confidential *NOT full-time Equivalents. Names listed if contributed some time to that part of project.

57 THANK YOU

IBM Watson : Beyond playing Jeopardy!

IBM Watson : Beyond playing Jeopardy! IBM Watson : Beyond playing Jeopardy! Katharine Frase, VP Industries Research, IBM with thanks to: David Ferrucci, Principal Investigator, DeepQA Team @ IBM Research April 24, 2012 Want to Play Chess or

More information

What is Watson An Overview

What is Watson An Overview What is Watson An Overview Michael Perrone Mgr., Multicore Computing IBM TJ Watson Research Center With special thanks to the IBM Research DeepQA Team! Informed Decision Making: Search vs. Expert Q&A Decision

More information

WATSON. Michael Dundek Industry Architect. Best Student Recognition Event July 6-8, 2011 EMEA IBM Innovation Center La Gaude, France

WATSON. Michael Dundek Industry Architect. Best Student Recognition Event July 6-8, 2011 EMEA IBM Innovation Center La Gaude, France WATSON Michael Dundek Industry Architect Best Student Recognition Event July 6-8, 2011 EMEA IBM Innovation Center La Gaude, France Want to Play Chess or Just Chat? Chess A finite, mathematically well-defined

More information

Course Information. CSE 392, Computers Playing Jeopardy!, Fall 2011. http://www.cs.stonybrook.edu/~cse392

Course Information. CSE 392, Computers Playing Jeopardy!, Fall 2011. http://www.cs.stonybrook.edu/~cse392 CSE392 - Computers playing Jeopardy! Course Information CSE 392, Computers Playing Jeopardy!, Fall 2011 Stony Brook University http://www.cs.stonybrook.edu/~cse392 Course Description IBM Watson is a computer

More information

Unlocking Big Data: The Power of Cognitive Computing. James Kobielus, IBM

Unlocking Big Data: The Power of Cognitive Computing. James Kobielus, IBM Unlocking Big Data: The Power of Cognitive Computing James Kobielus, IBM James Kobielus IBM's big data evangelist IBM senior program director for product marketing in big data analytics Editor-in-chief

More information

Watson. An analytical computing system that specializes in natural human language and provides specific answers to complex questions at rapid speeds

Watson. An analytical computing system that specializes in natural human language and provides specific answers to complex questions at rapid speeds Watson An analytical computing system that specializes in natural human language and provides specific answers to complex questions at rapid speeds I.B.M. OHJ-2556 Artificial Intelligence Guest lecturing

More information

MAN VS. MACHINE. How IBM Built a Jeopardy! Champion. 15.071x The Analytics Edge

MAN VS. MACHINE. How IBM Built a Jeopardy! Champion. 15.071x The Analytics Edge MAN VS. MACHINE How IBM Built a Jeopardy! Champion 15.071x The Analytics Edge A Grand Challenge In 2004, IBM Vice President Charles Lickel and coworkers were having dinner at a restaurant All of a sudden,

More information

» A Hardware & Software Overview. Eli M. Dow <emdow@us.ibm.com:>

» A Hardware & Software Overview. Eli M. Dow <emdow@us.ibm.com:> » A Hardware & Software Overview Eli M. Dow Overview:» Hardware» Software» Questions 2011 IBM Corporation Early implementations of Watson ran on a single processor where it took 2 hours

More information

Putting IBM Watson to Work In Healthcare

Putting IBM Watson to Work In Healthcare Martin S. Kohn, MD, MS, FACEP, FACPE Chief Medical Scientist, Care Delivery Systems IBM Research marty.kohn@us.ibm.com Putting IBM Watson to Work In Healthcare 2 SB 1275 Medical data in an electronic or

More information

Andre Standback. IT 103, Sec. 001 2/21/12. IBM s Watson. GMU Honor Code on http://academicintegrity.gmu.edu/honorcode/. I am fully aware of the

Andre Standback. IT 103, Sec. 001 2/21/12. IBM s Watson. GMU Honor Code on http://academicintegrity.gmu.edu/honorcode/. I am fully aware of the Andre Standback IT 103, Sec. 001 2/21/12 IBM s Watson "By placing this statement on my webpage, I certify that I have read and understand the GMU Honor Code on http://academicintegrity.gmu.edu/honorcode/.

More information

BMW11: Dealing with the Massive Data Generated by Many-Core Systems. Dr Don Grice. 2011 IBM Corporation

BMW11: Dealing with the Massive Data Generated by Many-Core Systems. Dr Don Grice. 2011 IBM Corporation BMW11: Dealing with the Massive Data Generated by Many-Core Systems Dr Don Grice IBM Systems and Technology Group Title: Dealing with the Massive Data Generated by Many Core Systems. Abstract: Multi-core

More information

MapReduce and Hadoop Distributed File System

MapReduce and Hadoop Distributed File System MapReduce and Hadoop Distributed File System 1 B. RAMAMURTHY Contact: Dr. Bina Ramamurthy CSE Department University at Buffalo (SUNY) bina@buffalo.edu http://www.cse.buffalo.edu/faculty/bina Partially

More information

Paul J. Ledak Vice President, IBM Research. 2011 IBM Corporation

Paul J. Ledak Vice President, IBM Research. 2011 IBM Corporation Paul J. Ledak Vice President, IBM Research 2011 IBM Corporation Watson answers a grand challenge Can we design a computing system that rivals a human s ability to answer questions posed in natural language,

More information

How Big Data and Artificial Intelligence Change the Game for. presented by Jamie Bisker Senior Analyst, P&C Insurance Aite Group

How Big Data and Artificial Intelligence Change the Game for. presented by Jamie Bisker Senior Analyst, P&C Insurance Aite Group How Big Data and Artificial Intelligence Change the Game for Insurance Professionals presented by Jamie Bisker Senior Analyst, P&C Insurance Aite Group Innovation Provocateur November 2014 Agenda Opening

More information

IBM's Watson could usher in new era of ALS research and medicine http://www.ibm.com/smarterplanet/us/en/healthcare_soluti ons/ideas/index.html?

IBM's Watson could usher in new era of ALS research and medicine http://www.ibm.com/smarterplanet/us/en/healthcare_soluti ons/ideas/index.html? IBM's Watson could usher in new era of ALS research and medicine http://www.ibm.com/smarterplanet/us/en/healthcare_soluti ons/ideas/index.html?re=cs1 By Sharon Gaudin February 17, 2011 06:00 AM ET IBM's

More information

MapReduce and Hadoop Distributed File System V I J A Y R A O

MapReduce and Hadoop Distributed File System V I J A Y R A O MapReduce and Hadoop Distributed File System 1 V I J A Y R A O The Context: Big-data Man on the moon with 32KB (1969); my laptop had 2GB RAM (2009) Google collects 270PB data in a month (2007), 20000PB

More information

What is Artificial Intelligence?

What is Artificial Intelligence? CSE 3401: Intro to Artificial Intelligence & Logic Programming Introduction Required Readings: Russell & Norvig Chapters 1 & 2. Lecture slides adapted from those of Fahiem Bacchus. 1 What is AI? What is

More information

Web Server Software Architectures

Web Server Software Architectures Web Server Software Architectures Author: Daniel A. Menascé Presenter: Noshaba Bakht Web Site performance and scalability 1.workload characteristics. 2.security mechanisms. 3. Web cluster architectures.

More information

Putting Analytics to Work In Healthcare

Putting Analytics to Work In Healthcare Martin S. Kohn, MD, MS, FACEP, FACPE Chief Medical Scientist, Care Delivery Systems IBM Research marty.kohn@us.ibm.com Putting Analytics to Work In Healthcare 2012 International Business Machines Corporation

More information

CSC384 Intro to Artificial Intelligence

CSC384 Intro to Artificial Intelligence CSC384 Intro to Artificial Intelligence What is Artificial Intelligence? What is Intelligence? Are these Intelligent? CSC384, University of Toronto 3 What is Intelligence? Webster says: The capacity to

More information

A Strategic Approach to Unlock the Opportunities from Big Data

A Strategic Approach to Unlock the Opportunities from Big Data A Strategic Approach to Unlock the Opportunities from Big Data Yue Pan, Chief Scientist for Information Management and Healthcare IBM Research - China [contacts: panyue@cn.ibm.com ] Big Data or Big Illusion?

More information

IBM Announces Eight Universities Contributing to the Watson Computing System's Development

IBM Announces Eight Universities Contributing to the Watson Computing System's Development IBM Announces Eight Universities Contributing to the Watson Computing System's Development Press release Related XML feeds Contact(s) information Related resources ARMONK, N.Y. - 11 Feb 2011: IBM (NYSE:

More information

Dr. John E. Kelly III Senior Vice President, Director of Research. Differentiating IBM: Research

Dr. John E. Kelly III Senior Vice President, Director of Research. Differentiating IBM: Research Dr. John E. Kelly III Senior Vice President, Director of Research Differentiating IBM: Research IBM Research Priorities Impact on IBM and the Marketplace Globalization and Leverage Balanced Research Agenda

More information

Interoperability, Standards and Open Advancement

Interoperability, Standards and Open Advancement Interoperability, Standards and Open Eric Nyberg 1 Open Shared resources & annotation schemas Shared component APIs Shared datasets (corpora, test sets) Shared software (open source) Shared configurations

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

Big Data, Analytics, Intelligence: Potenziale und Nutzen

Big Data, Analytics, Intelligence: Potenziale und Nutzen Dr. Matthias Kaiserswerth Vice President, Europe and Director, IBM Research Big Data, Analytics, Intelligence: Potenziale und Nutzen Market Forces Driving Health Care Transformation Source: If applicable,

More information

Global Technology Outlook 2011

Global Technology Outlook 2011 Global Technology Outlook 2011 Global Technology Outlook 2011 Since 1982, The Global Technology Outlook had identified significant technology trends five to even 10 years before they have come to realization.

More information

PLA 7 WAYS TO USE LOG DATA FOR PROACTIVE PERFORMANCE MONITORING. [ WhitePaper ]

PLA 7 WAYS TO USE LOG DATA FOR PROACTIVE PERFORMANCE MONITORING. [ WhitePaper ] [ WhitePaper ] PLA 7 WAYS TO USE LOG DATA FOR PROACTIVE PERFORMANCE MONITORING. Over the past decade, the value of log data for monitoring and diagnosing complex networks has become increasingly obvious.

More information

The Big Data Revolution: welcome to the Cognitive Era.

The Big Data Revolution: welcome to the Cognitive Era. The Big Data Revolution: welcome to the Cognitive Era. Yves Eychenne, Cloud Advisor, IBM Email: yves.eychenne@fr.ibm.com @yeychenne 2015 INTERNATIONAL BUSINESS MACHINES CORPORATION Agenda Big Data and

More information

A Distributed Render Farm System for Animation Production

A Distributed Render Farm System for Animation Production A Distributed Render Farm System for Animation Production Jiali Yao, Zhigeng Pan *, Hongxin Zhang State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058, China {yaojiali, zgpan, zhx}@cad.zju.edu.cn

More information

The 5 New Realities of Server Monitoring

The 5 New Realities of Server Monitoring Uptime Infrastructure Monitor Whitepaper The 5 New Realities of Server Monitoring How to Maximize Virtual Performance, Availability & Capacity Cost Effectively. Server monitoring has never been more critical.

More information

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

A Novel Cloud Based Elastic Framework for Big Data Preprocessing School of Systems Engineering A Novel Cloud Based Elastic Framework for Big Data Preprocessing Omer Dawelbeit and Rachel McCrindle October 21, 2014 University of Reading 2008 www.reading.ac.uk Overview

More information

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS! The Bloor Group IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS VENDOR PROFILE The IBM Big Data Landscape IBM can legitimately claim to have been involved in Big Data and to have a much broader

More information

Software design ideas for SoLID

Software design ideas for SoLID Software design ideas for SoLID Ole Hansen Jefferson Lab EIC Software Meeting Jefferson Lab September 25, 2015 Ole Hansen (Jefferson Lab) Software design ideas for SoLID Sept 25, 2015 1 / 10 The SoLID

More information

Scheduling and Resource Management in Computational Mini-Grids

Scheduling and Resource Management in Computational Mini-Grids Scheduling and Resource Management in Computational Mini-Grids July 1, 2002 Project Description The concept of grid computing is becoming a more and more important one in the high performance computing

More information

Scaling Up Low Latency, Virtualization, Windows and Linux for Wall Street. Peter ffoulkes VP of Marketing, Adaptive Computing

Scaling Up Low Latency, Virtualization, Windows and Linux for Wall Street. Peter ffoulkes VP of Marketing, Adaptive Computing Scaling Up Low Latency, Virtualization, and for Wall Street Peter ffoulkes VP of Marketing, Adaptive Computing Mixed Environments Add Complexity Technology Change Security Compounded by crossplatform challenges

More information

Evaluation of Enterprise Data Protection using SEP Software

Evaluation of Enterprise Data Protection using SEP Software Test Validation Test Validation - SEP sesam Enterprise Backup Software Evaluation of Enterprise Data Protection using SEP Software Author:... Enabling you to make the best technology decisions Backup &

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

Natural Language to Relational Query by Using Parsing Compiler

Natural Language to Relational Query by Using Parsing Compiler Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,

More information

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications

More information

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays Red Hat Performance Engineering Version 1.0 August 2013 1801 Varsity Drive Raleigh NC

More information

IBM Netezza High Capacity Appliance

IBM Netezza High Capacity Appliance IBM Netezza High Capacity Appliance Petascale Data Archival, Analysis and Disaster Recovery Solutions IBM Netezza High Capacity Appliance Highlights: Allows querying and analysis of deep archival data

More information

Big Data Challenges. technology basics for data scientists. Spring - 2014. Jordi Torres, UPC - BSC www.jorditorres.

Big Data Challenges. technology basics for data scientists. Spring - 2014. Jordi Torres, UPC - BSC www.jorditorres. Big Data Challenges technology basics for data scientists Spring - 2014 Jordi Torres, UPC - BSC www.jorditorres.eu @JordiTorresBCN Data Deluge: Due to the changes in big data generation Example: Biomedicine

More information

IBM Centennial Getting ready for a Smarter Planet & Big Data

IBM Centennial Getting ready for a Smarter Planet & Big Data IBM Centennial Getting ready for a Smarter Planet & Big Data First Byte Symposium 1st Anniversary of LSDF for Life Sciences at Bioquant Heidelberg 26 Mai 2011 Dieter Münk Vice President IBM WW Storage

More information

Scalable Cloud Computing Solutions for Next Generation Sequencing Data

Scalable Cloud Computing Solutions for Next Generation Sequencing Data Scalable Cloud Computing Solutions for Next Generation Sequencing Data Matti Niemenmaa 1, Aleksi Kallio 2, André Schumacher 1, Petri Klemelä 2, Eija Korpelainen 2, and Keijo Heljanko 1 1 Department of

More information

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning

Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Sense Making in an IOT World: Sensor Data Analysis with Deep Learning Natalia Vassilieva, PhD Senior Research Manager GTC 2016 Deep learning proof points as of today Vision Speech Text Other Search & information

More information

Auto-Classification for Document Archiving and Records Declaration

Auto-Classification for Document Archiving and Records Declaration Auto-Classification for Document Archiving and Records Declaration Josemina Magdalen, Architect, IBM November 15, 2013 Agenda IBM / ECM/ Content Classification for Document Archiving and Records Management

More information

CSCI 101 - Historical Development. May 29, 2015

CSCI 101 - Historical Development. May 29, 2015 CSCI 101 - Historical Development May 29, 2015 Historical Development 1. IBM 2. Commodore 3. Unix and Linux 4. Raspberry pi IBM IBM or International Business Machine Corporation began in the late 1800's,

More information

Big Data Challenges in Bioinformatics

Big Data Challenges in Bioinformatics Big Data Challenges in Bioinformatics BARCELONA SUPERCOMPUTING CENTER COMPUTER SCIENCE DEPARTMENT Autonomic Systems and ebusiness Pla?orms Jordi Torres Jordi.Torres@bsc.es Talk outline! We talk about Petabyte?

More information

High Performance Computing and Big Data: The coming wave.

High Performance Computing and Big Data: The coming wave. High Performance Computing and Big Data: The coming wave. 1 In science and engineering, in order to compete, you must compute Today, the toughest challenges, and greatest opportunities, require computation

More information

This presentation provides an overview of the architecture of the IBM Workload Deployer product.

This presentation provides an overview of the architecture of the IBM Workload Deployer product. This presentation provides an overview of the architecture of the IBM Workload Deployer product. Page 1 of 17 This presentation starts with an overview of the appliance components and then provides more

More information

Graph Database Performance: An Oracle Perspective

Graph Database Performance: An Oracle Perspective Graph Database Performance: An Oracle Perspective Xavier Lopez, Ph.D. Senior Director, Product Management 1 Copyright 2012, Oracle and/or its affiliates. All rights reserved. Program Agenda Broad Perspective

More information

Colgate-Palmolive selects SAP HANA to improve the speed of business analytics with IBM and SAP

Colgate-Palmolive selects SAP HANA to improve the speed of business analytics with IBM and SAP selects SAP HANA to improve the speed of business analytics with IBM and SAP Founded in 1806, is a global consumer products company which sells nearly $17 billion annually in personal care, home care,

More information

An Alternative Storage Solution for MapReduce. Eric Lomascolo Director, Solutions Marketing

An Alternative Storage Solution for MapReduce. Eric Lomascolo Director, Solutions Marketing An Alternative Storage Solution for MapReduce Eric Lomascolo Director, Solutions Marketing MapReduce Breaks the Problem Down Data Analysis Distributes processing work (Map) across compute nodes and accumulates

More information

Logging for Cloud Computing Forensic Systems

Logging for Cloud Computing Forensic Systems INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL ISSN 1841-9836, 10(2):222-229, April, 2015. Logging for Cloud Computing Forensic Systems A. Pătraşcu, V.V. Patriciu Alecsandru Pătraşcu* 1. Military

More information

Data management challenges in todays Healthcare and Life Sciences ecosystems

Data management challenges in todays Healthcare and Life Sciences ecosystems Data management challenges in todays Healthcare and Life Sciences ecosystems Jose L. Alvarez Principal Engineer, WW Director Life Sciences jose.alvarez@seagate.com Evolution of Data Sets in Healthcare

More information

Getting to Know Big Data

Getting to Know Big Data Getting to Know Big Data Dr. Putchong Uthayopas Department of Computer Engineering, Faculty of Engineering, Kasetsart University Email: putchong@ku.th Information Tsunami Rapid expansion of Smartphone

More information

TeachingEnglish Lesson plans. Conversation Lesson News. Topic: News

TeachingEnglish Lesson plans. Conversation Lesson News. Topic: News Conversation Lesson News Topic: News Aims: - To develop fluency through a range of speaking activities - To introduce related vocabulary Level: Intermediate (can be adapted in either direction) Introduction

More information

Collaborations between Official Statistics and Academia in the Era of Big Data

Collaborations between Official Statistics and Academia in the Era of Big Data Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI vnn@umich.edu What

More information

Microsoft SQL Server 2008 R2 Enterprise Edition and Microsoft SharePoint Server 2010

Microsoft SQL Server 2008 R2 Enterprise Edition and Microsoft SharePoint Server 2010 Microsoft SQL Server 2008 R2 Enterprise Edition and Microsoft SharePoint Server 2010 Better Together Writer: Bill Baer, Technical Product Manager, SharePoint Product Group Technical Reviewers: Steve Peschka,

More information

Operational Insights for. Running IT at the Speed of Business

Operational Insights for. Running IT at the Speed of Business Operational Insights for Running IT at the Speed of Business This paper is sponsored by IBM Corporation Operational Insights for Running IT at the Speed of Business Page 2 Table of Contents Introduction...

More information

Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends

Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends Spring 2015 Thomas Hill, Ph.D. VP Analytic Solutions Dell Statistica Overview and Agenda Dell Software overview Dell in

More information

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next

More information

Analysis of Cloud Solutions for Asset Management

Analysis of Cloud Solutions for Asset Management ICT Innovations 2010 Web Proceedings ISSN 1857-7288 345 Analysis of Cloud Solutions for Asset Management Goran Kolevski, Marjan Gusev Institute of Informatics, Faculty of Natural Sciences and Mathematics,

More information

The goals of IBM Research are to advance computer science

The goals of IBM Research are to advance computer science Building Watson: An Overview of the DeepQA Project David Ferrucci, Eric Brown, Jennifer Chu-Carroll, James Fan, David Gondek, Aditya A. Kalyanpur, Adam Lally, J. William Murdock, Eric Nyberg, John Prager,

More information

PARALLELS CLOUD STORAGE

PARALLELS CLOUD STORAGE PARALLELS CLOUD STORAGE Performance Benchmark Results 1 Table of Contents Executive Summary... Error! Bookmark not defined. Architecture Overview... 3 Key Features... 5 No Special Hardware Requirements...

More information

GeoGrid Project and Experiences with Hadoop

GeoGrid Project and Experiences with Hadoop GeoGrid Project and Experiences with Hadoop Gong Zhang and Ling Liu Distributed Data Intensive Systems Lab (DiSL) Center for Experimental Computer Systems Research (CERCS) Georgia Institute of Technology

More information

PUSD High Frequency Word List

PUSD High Frequency Word List PUSD High Frequency Word List For Reading and Spelling Grades K-5 High Frequency or instant words are important because: 1. You can t read a sentence or a paragraph without knowing at least the most common.

More information

HADOOP ON ORACLE ZFS STORAGE A TECHNICAL OVERVIEW

HADOOP ON ORACLE ZFS STORAGE A TECHNICAL OVERVIEW HADOOP ON ORACLE ZFS STORAGE A TECHNICAL OVERVIEW 757 Maleta Lane, Suite 201 Castle Rock, CO 80108 Brett Weninger, Managing Director brett.weninger@adurant.com Dave Smelker, Managing Principal dave.smelker@adurant.com

More information

Integrating Apache Spark with an Enterprise Data Warehouse

Integrating Apache Spark with an Enterprise Data Warehouse Integrating Apache Spark with an Enterprise Warehouse Dr. Michael Wurst, IBM Corporation Architect Spark/R/Python base Integration, In-base Analytics Dr. Toni Bollinger, IBM Corporation Senior Software

More information

Introduction to Big Data! with Apache Spark" UC#BERKELEY#

Introduction to Big Data! with Apache Spark UC#BERKELEY# Introduction to Big Data! with Apache Spark" UC#BERKELEY# So What is Data Science?" Doing Data Science" Data Preparation" Roles" This Lecture" What is Data Science?" Data Science aims to derive knowledge!

More information

Making Watson fast. E. A. Epstein M. I. Schor B. S. Iyer A. Lally E. W. Brown J. Cwiklik

Making Watson fast. E. A. Epstein M. I. Schor B. S. Iyer A. Lally E. W. Brown J. Cwiklik Making Watson fast IBM Watsoni is a system created to demonstrate DeepQA technology by competing against human champions in a question-answering game designed for people. The DeepQA architecture was designed

More information

Teaching Analy-cs, Big Data and Sustainability: An IS perspec-ve

Teaching Analy-cs, Big Data and Sustainability: An IS perspec-ve Teaching Analy-cs, Big Data and Sustainability: An IS perspec-ve Raja Sooriamurthi / Randy Weinberg Informa(on Systems Program Carnegie Mellon University {raja,rweinberg}@cmu.edu Presenta-on Outline The

More information

SEAIP 2009 Presentation

SEAIP 2009 Presentation SEAIP 2009 Presentation By David Tan Chair of Yahoo! Hadoop SIG, 2008-2009,Singapore EXCO Member of SGF SIG Imperial College (UK), Institute of Fluid Science (Japan) & Chicago BOOTH GSB (USA) Alumni Email:

More information

Big Data: Study in Structured and Unstructured Data

Big Data: Study in Structured and Unstructured Data Big Data: Study in Structured and Unstructured Data Motashim Rasool 1, Wasim Khan 2 mail2motashim@gmail.com, khanwasim051@gmail.com Abstract With the overlay of digital world, Information is available

More information

Topology Aware Analytics for Elastic Cloud Services

Topology Aware Analytics for Elastic Cloud Services Topology Aware Analytics for Elastic Cloud Services athafoud@cs.ucy.ac.cy Master Thesis Presentation May 28 th 2015, Department of Computer Science, University of Cyprus In Brief.. a Tool providing Performance

More information

Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013

Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013 Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013 James Maltby, Ph.D 1 Outline of Presentation Semantic Graph Analytics Database Architectures In-memory Semantic Database Formulation

More information

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014 Increase Agility and Reduce Costs with a Logical Data Warehouse February 2014 Table of Contents Summary... 3 Data Virtualization & the Logical Data Warehouse... 4 What is a Logical Data Warehouse?... 4

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

Cray: Enabling Real-Time Discovery in Big Data

Cray: Enabling Real-Time Discovery in Big Data Cray: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects

More information

WHITE PAPER TOPIC DATE Enabling MaaS Open Data Agile Design and Deployment with CA ERwin. Nuccio Piscopo. agility made possible

WHITE PAPER TOPIC DATE Enabling MaaS Open Data Agile Design and Deployment with CA ERwin. Nuccio Piscopo. agility made possible WHITE PAPER TOPIC DATE Enabling MaaS Open Data Agile Design and Deployment with CA ERwin Nuccio Piscopo agility made possible Table of Contents Introduction 3 MaaS enables Agile Open Data Design 4 MaaS

More information

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage White Paper Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage A Benchmark Report August 211 Background Objectivity/DB uses a powerful distributed processing architecture to manage

More information

Depth-of-Knowledge Levels for Four Content Areas Norman L. Webb March 28, 2002. Reading (based on Wixson, 1999)

Depth-of-Knowledge Levels for Four Content Areas Norman L. Webb March 28, 2002. Reading (based on Wixson, 1999) Depth-of-Knowledge Levels for Four Content Areas Norman L. Webb March 28, 2002 Language Arts Levels of Depth of Knowledge Interpreting and assigning depth-of-knowledge levels to both objectives within

More information

Grid Scheduling Dictionary of Terms and Keywords

Grid Scheduling Dictionary of Terms and Keywords Grid Scheduling Dictionary Working Group M. Roehrig, Sandia National Laboratories W. Ziegler, Fraunhofer-Institute for Algorithms and Scientific Computing Document: Category: Informational June 2002 Status

More information

The Doctoral Degree in Social Work

The Doctoral Degree in Social Work University Press Scholarship Online You are looking at 1-10 of 22 items for: keywords : decision analysis The Doctoral Degree in Social Work Peter Lyons and Howard J. Doueck in The Dissertation: From Beginning

More information

Search and Information Retrieval

Search and Information Retrieval Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search

More information

From Petabytes (10 15 ) to Yottabytes (10 24 ), & Before 2020! Steve Barber, CEO June 27, 2012

From Petabytes (10 15 ) to Yottabytes (10 24 ), & Before 2020! Steve Barber, CEO June 27, 2012 From Petabytes (10 15 ) to Yottabytes (10 24 ), & Before 2020! Steve Barber, CEO June 27, 2012 Safe Harbor Statement Forward Looking Statements The following information contains, or may be deemed to contain,

More information

How To Use Hadoop For Gis

How To Use Hadoop For Gis 2013 Esri International User Conference July 8 12, 2013 San Diego, California Technical Workshop Big Data: Using ArcGIS with Apache Hadoop David Kaiser Erik Hoel Offering 1330 Esri UC2013. Technical Workshop.

More information

A stream computing approach towards scalable NLP

A stream computing approach towards scalable NLP A stream computing approach towards scalable NLP Xabier Artola, Zuhaitz Beloki, Aitor Soroa IXA group. University of the Basque Country. LREC, Reykjavík 2014 Table of contents 1

More information

2009 Oracle Corporation 1

2009 Oracle Corporation 1 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,

More information

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale WHITE PAPER Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale Sponsored by: IBM Carl W. Olofson December 2014 IN THIS WHITE PAPER This white paper discusses the concept

More information

Performance rule violations usually result in increased CPU or I/O, time to fix the mistake, and ultimately, a cost to the business unit.

Performance rule violations usually result in increased CPU or I/O, time to fix the mistake, and ultimately, a cost to the business unit. Is your database application experiencing poor response time, scalability problems, and too many deadlocks or poor application performance? One or a combination of zparms, database design and application

More information

Who needs humans to run computers? Role of Big Data and Analytics in running Tomorrow s Computers illustrated with Today s Examples

Who needs humans to run computers? Role of Big Data and Analytics in running Tomorrow s Computers illustrated with Today s Examples 15 April 2015, COST ACROSS Workshop, Würzburg Who needs humans to run computers? Role of Big Data and Analytics in running Tomorrow s Computers illustrated with Today s Examples Maris van Sprang, 2015

More information

Getting started with a data quality program

Getting started with a data quality program IBM Software White Paper Information Management Getting started with a data quality program 2 Getting started with a data quality program The data quality challenge Organizations depend on quality data

More information

Product Version 1.0 Document Version 1.0-B

Product Version 1.0 Document Version 1.0-B VidyoDashboard Installation Guide Product Version 1.0 Document Version 1.0-B Table of Contents 1. Overview... 3 About This Guide... 3 Prerequisites... 3 2. Installing VidyoDashboard... 5 Installing the

More information

DBpedia German: Extensions and Applications

DBpedia German: Extensions and Applications DBpedia German: Extensions and Applications Alexandru-Aurelian Todor FU-Berlin, Innovationsforum Semantic Media Web, 7. Oktober 2014 Overview Why DBpedia? New Developments in DBpedia German Problems in

More information

In-memory databases and innovations in Business Intelligence

In-memory databases and innovations in Business Intelligence Database Systems Journal vol. VI, no. 1/2015 59 In-memory databases and innovations in Business Intelligence Ruxandra BĂBEANU, Marian CIOBANU University of Economic Studies, Bucharest, Romania babeanu.ruxandra@gmail.com,

More information

Statistical Machine Translation: IBM Models 1 and 2

Statistical Machine Translation: IBM Models 1 and 2 Statistical Machine Translation: IBM Models 1 and 2 Michael Collins 1 Introduction The next few lectures of the course will be focused on machine translation, and in particular on statistical machine translation

More information

Search Engine Based Intelligent Help Desk System: iassist

Search Engine Based Intelligent Help Desk System: iassist Search Engine Based Intelligent Help Desk System: iassist Sahil K. Shah, Prof. Sheetal A. Takale Information Technology Department VPCOE, Baramati, Maharashtra, India sahilshahwnr@gmail.com, sheetaltakale@gmail.com

More information

Patent Big Data Analysis by R Data Language for Technology Management

Patent Big Data Analysis by R Data Language for Technology Management , pp. 69-78 http://dx.doi.org/10.14257/ijseia.2016.10.1.08 Patent Big Data Analysis by R Data Language for Technology Management Sunghae Jun * Department of Statistics, Cheongju University, 360-764, Korea

More information