Data Search. Searching and Finding information in Unstructured and Structured Data Sources

Data Search: Searching and Finding information in Unstructured and Structured Data Sources
Erik Fransen, Senior Business Consultant
IRM UK, DW/BI 2009, London; November 3, 11.00-12.00 P.M.
Centennium BI expertisehuis, The Hague, The Netherlands
e.fransen@centennium.nl

Agenda
- Introduction
- Industry models
- Combining structured & unstructured data
  - Pure Portal
  - Index it all
  - Structure it all
- Summary

Erik Fransen: Profile
- Background: Knowledge Engineering, Middlesex University
- Expertise areas:
  - Business Intelligence
  - Knowledge engineering
  - Knowledge & Content management
  - Data warehousing
  - Analytics
- CBIP

Introduction

Combining BI with unstructured data
- Integrated access to relevant information ("provide complete picture")
- Unstructured data like documents provide valuable context to numerical data:
  - Customer complaints
  - Competitor's press releases
  - Marketing documents
- Insurance fraud analysis (i.e. claim statistics and claim forms)
- Competitive Intelligence (i.e. market share data and competitor news)
- Customer retention (i.e. sales data and customer complaints)
- Data Search acts as a bridge between structured and unstructured data.
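The customer retention case above (sales data plus customer complaints) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the sample rows, the customer ids, and the rule that flags a customer whose revenue dropped and who also complained. It only shows how numerical data gains context from text, not how any particular BI product does it.

```python
from collections import defaultdict

# Hypothetical structured data: (customer_id, quarter, revenue) rows from a sales mart.
SALES = [
    ("C001", "2009-Q1", 1200.0),
    ("C001", "2009-Q2", 700.0),
    ("C002", "2009-Q1", 500.0),
    ("C002", "2009-Q2", 520.0),
]

# Hypothetical unstructured data: free-text complaints tagged with a customer id.
COMPLAINTS = [
    ("C001", "Invoice was wrong twice and support never called back."),
]


def retention_view(sales, complaints):
    """Return one record per customer: revenue per quarter plus complaint texts."""
    view = defaultdict(lambda: {"revenue": {}, "complaints": []})
    for customer_id, quarter, revenue in sales:
        view[customer_id]["revenue"][quarter] = revenue
    for customer_id, text in complaints:
        view[customer_id]["complaints"].append(text)
    return dict(view)


def at_risk(view):
    """Customers whose revenue fell from first to last quarter and who complained."""
    flagged = []
    for customer_id, record in sorted(view.items()):
        quarters = sorted(record["revenue"])
        if not quarters:
            continue  # complaints without any sales rows: nothing to compare
        dropped = record["revenue"][quarters[-1]] < record["revenue"][quarters[0]]
        if dropped and record["complaints"]:
            flagged.append(customer_id)
    return flagged


view = retention_view(SALES, COMPLAINTS)
print(at_risk(view))
```

The numbers alone show that revenue for one customer fell; the complaint text attached to the same customer suggests why.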

(Un)structured data keeps growing
[Figure: timeline of data volume in gigabytes, with more than 80% unstructured. Milestones: cave paintings and bone tools (40,000 BC), writing (3500 BC), paper (105), printing (1450), electricity and telephone (1870), transistor (1947), computing (1950), Internet (DARPA, late 1960s), the Web (1993). Database milestones: SQL-70, Oracle-79, SQL-89, SQL-92, SQL-99, SQL-03. Source: Forrester]

Industry Model: Bill Inmon's DW 2.0
- Hold data at the lowest detail
- Hold data to infinity
- Have integrity of data and have online high-performance transaction processing
- Tightly couple metadata to the data warehouse environment
- Link structured data and unstructured data
[Diagram label: Text Data]

Industry Model: Information Access Architecture (Gartner)

Industry Model: Enterprise Search Platform (Forrester)

Data Search Scenarios: Searching and Finding information in Unstructured and Structured Data Sources

Global architecture
[Diagram. Layers: Unstructured, Middleware, Portal, Structured, Master & Meta Data. Components shown: OLTP, ODS, DWH, Data Marts, Cubes, Reports, OLAP, Mining, Financial Apps, Content Management System, Fileservers, Email, Intranet/internet, Search Index, Database Search, Text Mining, Visualisation.]

Three data search scenarios
[Diagram: the global architecture with the three scenarios marked: Pure Portal, Index it all, Structure it all.]

Scenario 1: Pure Portal
- Many portlets, one user interface
- Business user manually combines content from several independent sources
- Risk: too complex for user
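A minimal Python sketch of the Pure Portal idea, under stated assumptions: the three portlet functions, their sample content, and the `render_portal` helper are all hypothetical and stand in for independent back ends. The point is that the portal only places results side by side; nothing is merged or ranked across sources, so combining them is left to the user.

```python
# Hypothetical, independent back ends; none of these names come from a real product.
def bi_report_portlet(query):
    """Pretend BI source: report titles that mention the query."""
    reports = ["Market share by region", "Sales by customer segment"]
    return [r for r in reports if query.lower() in r.lower()]


def news_portlet(query):
    """Pretend news feed: headlines that mention the query."""
    news = ["Competitor expands market share in Europe", "New office opened"]
    return [n for n in news if query.lower() in n.lower()]


def file_portlet(query):
    """Pretend file server: file names that contain the hyphenated query."""
    files = ["market-share-2009.pdf", "holiday-schedule.doc"]
    return [f for f in files if query.lower().replace(" ", "-") in f.lower()]


PORTLETS = {
    "BI reports": bi_report_portlet,
    "News": news_portlet,
    "Files": file_portlet,
}


def render_portal(query):
    """One user interface, many portlets: each source answers on its own."""
    return {title: portlet(query) for title, portlet in PORTLETS.items()}


for title, items in render_portal("market share").items():
    print(title, items)
```

Each portlet applies its own matching rules and returns its own list, which is where the "too complex for user" risk comes from as the number of portlets grows.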

1: Pure Portal
[Diagram: the global architecture with the Pure Portal scenario highlighted.]

Integrate news with BI information (Source: Aruba)

Structured BI info

... and Photos, Files and Maps

Scenario 2: Index it all
- Enterprise Search from one user interface
- Business user knows what to look for and expects a complete picture as a result
- Risk: many irrelevant search results due to the nature of document indexing

2: Index it all
[Diagram: the global architecture with the Index it all scenario highlighted.]

Scenario 2: Index it all (architecture)
[Diagram. Components: user interface, search application, search index, unstructured data sources; BI application, reports, data warehouse architecture, structured data sources. A BI report is indexed as if it were a document.]
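The idea that a BI report is indexed as if it were a document can be sketched with a toy inverted index in Python. This is an illustration under assumptions, not the design of any search product: the `SearchIndex` class, the item ids, and the sample texts are hypothetical, and a query matches only items that contain every query term.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    """Toy inverted index: token -> ids of the items containing it."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._kinds = {}

    def add(self, item_id, kind, text):
        """Index any item by its text; a report is treated like a document."""
        self._kinds[item_id] = kind
        for token in tokenize(text):
            self._postings[token].add(item_id)

    def search(self, query):
        """Return (item_id, kind) pairs for items containing all query tokens."""
        tokens = tokenize(query)
        if not tokens:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        return sorted((item_id, self._kinds[item_id]) for item_id in hits)


index = SearchIndex()
# Unstructured source: a document from a file server (hypothetical content).
index.add("doc-1", "document", "Customer complaints about churn in the telco market")
# Structured source: title and description of a BI report (hypothetical content).
index.add("rpt-7", "report", "Quarterly churn report by customer segment")

print(index.search("churn customer"))
```

Because matching is purely term based, a single result list mixes reports and documents, which is also why this scenario can return many irrelevant hits.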

Example: IBM Cognos 8 Go! Search
- Integration with enterprise search applications (IBM OmniFind, Google OneBox for Enterprise, Yahoo, Autonomy)
- Search results return all relevant structured content (reports, analyses, etc.) and unstructured content (Word documents, PDFs, etc.) within a single interface.

Example: IBM OmniFind

Example: IBM OmniFind

SAP BusinessObjects Intelligent Search

SAP BusinessObjects Intelligent Search

Scenario 3: Structure it all
- Generate structure using document warehousing and text mining
- Business user knows exactly what to look for
- Risk: limited flexibility for user

3: Structure it all
[Diagram: the global architecture with the Structure it all scenario highlighted.]

Generating structure in document warehouse
1. Identify Sources
   - Sources are not fixed
   - Iterative process, sources lead to new sources
2. Retrieve Documents
   - Internal source retrieval: file servers, CMS/DMS
   - External source retrieval, using crawlers, spiders
3. Preprocess Documents
   - Format documents in a consistent manner
   - Files must be in suitable form for text analysis
4. Text Mining
   - Linguistic analysis
   - Key features are extracted
   - Indexing documents
   - Summarizing documents
5. Compile Metadata
   - Carefully attach metadata to document
   - Used for querying, matching, navigation support
   - Store in document warehouse
Source: Dan Sullivan
[Diagram: data warehouse architecture and document warehouse architecture, with (meta)data combined between them.]
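The five steps above can be sketched as a small Python pipeline. This is a simplified illustration under stated assumptions: the in-memory `SOURCES` dictionary stands in for file servers, a CMS/DMS, or a crawler; "text mining" is reduced to word counts and a first-sentence summary; and all function names are hypothetical rather than taken from the cited source.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}

# Step 1, identify sources (hypothetical): path -> raw text, standing in for
# file servers, a CMS/DMS, or crawled pages.
SOURCES = {
    "intranet/press-001.txt": "  Acme  lowers   prices.\nAcme targets telco "
                              "customers with lower prices. ",
}


def retrieve(sources):
    """Step 2: fetch raw documents from the identified sources."""
    return list(sources.items())


def preprocess(raw_text):
    """Step 3: bring documents into a consistent form (collapse whitespace)."""
    return re.sub(r"\s+", " ", raw_text).strip()


def mine(text, top_n=2):
    """Step 4: extract key features; here, frequent words and a one-line summary."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    keywords = [word for word, _ in Counter(words).most_common(top_n)]
    summary = text.split(". ")[0].rstrip(".") + "."
    return keywords, summary


def compile_metadata(source, keywords, summary):
    """Step 5: attach metadata used for querying, matching, and navigation."""
    return {"source": source, "keywords": keywords, "summary": summary}


def build_document_warehouse(sources):
    """Run all steps and store each document together with its metadata."""
    warehouse = {}
    for source, raw_text in retrieve(sources):
        text = preprocess(raw_text)
        keywords, summary = mine(text)
        warehouse[source] = {
            "text": text,
            "metadata": compile_metadata(source, keywords, summary),
        }
    return warehouse


warehouse = build_document_warehouse(SOURCES)
print(warehouse["intranet/press-001.txt"]["metadata"])
```

In practice the process is iterative: documents found in one pass may point to new sources, which would be added to the source list before the next run.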

Document warehouse
- Contains complete documents or URLs
- Metadata about documents: summaries, authors' names, publication dates, titles, sources, keywords, etc.
- Translations of documents
- Thematic clustering of similar documents
- Topical or thematic indexes
- Extracted key features (structure)
- Dimensions and Facts, linked to documents, summaries, etc.
- Combine with the data warehouse
[Diagram label: Document warehouse architecture]

BI reporting on dimensional model
[Diagram: one dimensional model spanning the data warehouse and the document warehouse. Facts: Sales Facts, Call Facts. Dimensions: Product, Customer, Action, Competitor, Sales person, Time, Telco Term.]
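A hedged sketch of what reporting across both warehouses could look like, using Python's built-in `sqlite3` module. The table layouts, column names, and sample rows are assumptions made for illustration; only the fact and dimension names (Sales Facts, Call Facts, Customer, Telco Term) come from the diagram. Call facts are assumed to be derived from text-mined call notes and keep a link to the originating document.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory star schema with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_telco_term (term_key INTEGER PRIMARY KEY, term TEXT);
        -- Fact table from the data warehouse.
        CREATE TABLE sales_facts (customer_key INTEGER, amount REAL);
        -- Fact table from the document warehouse; document_id links back
        -- to the stored call note.
        CREATE TABLE call_facts (customer_key INTEGER, term_key INTEGER,
                                 document_id TEXT);

        INSERT INTO dim_customer VALUES (1, 'Alpha'), (2, 'Beta');
        INSERT INTO dim_telco_term VALUES (1, 'roaming'), (2, 'contract');
        INSERT INTO sales_facts VALUES (1, 100.0), (1, 50.0), (2, 80.0);
        INSERT INTO call_facts VALUES (1, 1, 'doc-17'), (1, 1, 'doc-18'),
                                      (2, 2, 'doc-19');
    """)
    return conn


# Each measure is aggregated in its own subquery so that joining two fact
# tables on the shared customer dimension does not multiply rows.
REPORT_SQL = """
SELECT c.name,
       (SELECT COALESCE(SUM(s.amount), 0)
          FROM sales_facts s
         WHERE s.customer_key = c.customer_key) AS total_sales,
       (SELECT COUNT(*)
          FROM call_facts f
          JOIN dim_telco_term t ON t.term_key = f.term_key
         WHERE f.customer_key = c.customer_key AND t.term = ?) AS calls_with_term
  FROM dim_customer c
 ORDER BY c.name
"""


def sales_vs_calls(conn, term):
    """Per customer: total sales and number of calls mentioning the term."""
    return conn.execute(REPORT_SQL, (term,)).fetchall()


conn = build_demo_db()
print(sales_vs_calls(conn, "roaming"))
```

The shared Customer dimension is what lets a single report show sales figures next to counts derived from documents.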

Generate structure using text mining tools
Example taken from SPSS PASW Text Analytics; many other tools are available (IBM, SAS, Oracle, SAP BO, Microsoft, etc.).

Generating structure using UIMA
- Unstructured Information Management Architecture
- Originates from IBM, now Apache UIMA: http://incubator.apache.org/uima/
- UIMA is supported by all main BI vendors.
Source: IBM

Example: Generating structure using UIMA
- Analyzed by a collection of text analytics
- Detected semantic entities and relations highlighted
- Represented in UIMA Common Analysis Structure (CAS)
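The idea behind a common analysis structure, where several analytics add typed annotations over the same text, can be sketched in Python. This is not the Apache UIMA API: the `AnalysisStructure` and `Annotation` classes, the two annotators, and the sample sentence are hypothetical stand-ins meant only to show stand-off annotations (type plus character offsets) kept alongside the original text. Relations between entities are left out to keep the sketch short.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """A typed span over the document text, stored as character offsets."""
    type_name: str
    begin: int
    end: int


@dataclass
class AnalysisStructure:
    """Hypothetical stand-in for a CAS: the text plus all annotations on it."""
    text: str
    annotations: list = field(default_factory=list)

    def covered_text(self, annotation):
        """Return the piece of text an annotation points at."""
        return self.text[annotation.begin:annotation.end]


def annotate_organizations(cas, known=("Acme", "Telco One")):
    """Dictionary-based annotator; the names in `known` are made up."""
    for name in known:
        for match in re.finditer(re.escape(name), cas.text):
            cas.annotations.append(
                Annotation("Organization", match.start(), match.end()))


def annotate_years(cas):
    """Pattern-based annotator for four-digit years starting with 19 or 20."""
    for match in re.finditer(r"\b(?:19|20)\d{2}\b", cas.text):
        cas.annotations.append(Annotation("Year", match.start(), match.end()))


def run_pipeline(text):
    """Run each annotator in turn over one shared analysis structure."""
    cas = AnalysisStructure(text)
    for annotator in (annotate_organizations, annotate_years):
        annotator(cas)
    cas.annotations.sort(key=lambda a: a.begin)
    return cas


cas = run_pipeline("Acme acquired Telco One in 2009.")
for annotation in cas.annotations:
    print(annotation.type_name, cas.covered_text(annotation))
```

Because annotations only reference offsets, the original text stays untouched and further annotators can be added without changing the existing ones.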

Summary
- Growing business need for combining BI with unstructured data
- Data Search bridges the gap between both worlds
  - Scenario 1: Pure Portal
  - Scenario 2: Index it all
  - Scenario 3: Structure it all
- Scenarios can be combined.
Questions?