3/17/2009. Knowledge Management BIKM eclassifier Integrated BIKM Tools




Paper by W. F. Cody, J. T. Kreulen, V. Krishna, W. S. Spangler
Presentation by Dylan Chi
Discussion by Debojit Dhar

THE INTEGRATION OF BUSINESS INTELLIGENCE AND KNOWLEDGE MANAGEMENT

BUSINESS INTELLIGENCE
Business intelligence (BI) refers to skills, technologies, applications and practices used to help a business acquire a better understanding of its commercial context. Business intelligence may also refer to the collected information itself. --Wikipedia

Business intelligence technology has coalesced around the use of two technologies:
- data warehousing
- on-line analytical processing (OLAP)

DATA WAREHOUSING
Data warehousing is a systematic approach to collecting relevant business data into a single repository, where it is organized and validated so that it can be analyzed and presented in a form that is useful for business decision-making.

The various sources for the relevant business data are referred to as the operational data stores (ODS). The data are extracted, transformed, and loaded (ETL) from the ODS systems into a data mart.

OLAP
Online analytical processing, or OLAP, is an approach to quickly answer multi-dimensional analytical queries. --Wikipedia

- In the data mart, the data are modeled as an OLAP cube (multidimensional model)
- The multidimensional model supports flexible drill-down and roll-up analyses

OLAP CUBE
[Figure: OLAP cube]

KNOWLEDGE MANAGEMENT
Knowledge Management (KM) comprises a range of practices used in an organization to identify, create, represent, distribute and enable adoption of insights and experiences. Such insights and experiences comprise knowledge, either embodied in individuals or embedded in organizational processes or practice. --Wikipedia

In this context, KM is used for the management and analysis of unstructured information, particularly text documents.

Textual information sources: business documents, e-mail, news and press articles, technical journals, patents, conference proceedings, business contracts, government reports, regulatory filings, discussion groups, problem report databases, sales and support notes, and the web.

Does the authors' definition of business intelligence agree with yours? Why or why not?
What business intelligence applications can you think of that aren't mentioned in the paper?

BIKM
- The authors believe that, over time, techniques from both BI and KM will blend
- New techniques will seamlessly span the analysis of both data and text

BIKM PROBLEMS
- Understanding sales effectiveness
  - Products, sales representatives, customers
  - Sales techniques
- Improving support and warranty analysis
  - Customer complaints
- Relating CRM to profitability
  - Hidden cost
  - Complete picture

ENVIRONMENTAL ISSUES
- Text information sits inside the same database as the business data
- Textual information is in systems distinct from the ODS systems
- The sources of text to relate to a business data analysis are not known

EXAMPLE
A business analyst explores a revenue cube and detects a downward movement in revenues for a software product in some part of the United States. The data cube shows the phenomenon but does not provide any explanation for it.

EXAMPLE
To understand the phenomenon, some text sources could be used to extract valuable information:
- Enterprise-specific information
  - Service call logs about the product
  - Competitive intelligence reports
- Purchased text information
- Public documents in Web forms
  - Discussions about products

Do you think integrating BI and KM is a good idea?
Do you think the ideas in the paper made it, or did not make it, into the mainstream BI tools?
Have you come across tools that use the BIKM concept?

ECLASSIFIER
eclassifier is an application that can quickly analyze a large collection of documents and utilize multiple algorithms, visualizations, and metrics to create and to maintain a taxonomy.

It is very difficult to automatically produce a satisfactory taxonomy for a diverse set of users without allowing human intervention.

DOCUMENT REPRESENTATION
- Feature space of terms and phrases
  - The feature space is obtained by counting the occurrences of terms and phrases in each document
  - Stop-word lists
  - Synonym list, stock phrase list, include word list
- Vector of weighted frequencies
- Dictionary tool

TAXONOMY GENERATION
- Automatically create an initial categorization or taxonomy
  - k-means algorithm
- Interactive, query-based clustering
  - Seeds categories based on a set of keywords
  - Tests out the queries
  - Refines the clusters based on the observed results
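The document representation and k-means steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not eclassifier's actual implementation: the stop-word list, the unit-length weighting of term counts, and the choice to seed k-means with the first k documents are all assumptions made here so the example stays small and deterministic.

```python
import math
from collections import Counter

# Hypothetical stop-word list; a real one would be far longer.
STOP_WORDS = {"a", "after", "for", "is", "not", "on", "the", "to"}


def tokenize(text):
    """Lower-case, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def build_vectors(docs):
    """Return (vocabulary, vectors): one unit-length term-count vector per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = []
    for c in counts:
        raw = [c[t] for t in vocab]
        # Assumed weighting: scale each vector to unit length.
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0
        vectors.append([x / norm for x in raw])
    return vocab, vectors


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, iterations=10):
    """Plain k-means, seeded with the first k vectors (an assumption for determinism)."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each document to its nearest centroid.
        for i, v in enumerate(vectors):
            assignment[i] = min(range(k), key=lambda j: squared_distance(v, centroids[j]))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


docs = [
    "install fails on startup",
    "refund request for license",
    "install error after startup crash",
    "license refund not received",
]
vocab, vectors = build_vectors(docs)
assignment, _ = kmeans(vectors, k=2)
print(assignment)  # cluster index per document
```

The resulting clusters would serve only as the initial taxonomy; the interactive, query-based refinement described on the slide is not modeled here.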

TAXONOMY EVALUATION
- Once we have an initial taxonomy of the documents, eclassifier provides the means to understand and to evaluate it
- The category label is generated using a term-coverage algorithm that identifies dominant terms in the feature space
- Metrics: size, cohesion, distinctness

TAXONOMY VISUALIZATION
[Figure: taxonomy visualization]

CLASSIFICATION
- Assign additional documents to the taxonomy as they become available
- eclassifier creates a batch classifier to process the additional documents
  - Nearest centroid
  - Naive Bayes multivariate
  - Naive Bayes multinomial
  - Decision tree

ANALYSIS AND REPORTING
- FAQ analysis
- Discovery of correlations
  - Chi-squared test
  - Continuous variables
- Using a generated taxonomy to compare document collections

The main tasks of eclassifier can be represented as:
- Taxonomy generation
- Taxonomy and category evaluation
- Taxonomy visualization
- Classification
- Analysis and reporting

Which of these do you think is most important and why?
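For the "discovery of correlations" item under Analysis and Reporting, the chi-squared test can be illustrated with a small contingency table. The sketch below is a generic Pearson chi-squared statistic, not eclassifier's own code; the table (document category against region) and its counts are made up for illustration, and the function assumes every row and column total is positive.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table (list of rows).

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if category and region were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are document categories, columns are regions.
complaints_by_region = [
    [30, 10],  # "install failure" documents: West, East
    [20, 40],  # "pricing" documents: West, East
]
stat = chi_squared(complaints_by_region)

# 3.841 is the 5% critical value for one degree of freedom (a 2x2 table).
print(stat, stat > 3.841)
```

A statistic above the critical value suggests that the category and the region are not independent, which is the kind of signal an analyst would then follow up in the documents themselves.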

INTEGRATION PARADIGM
- Text is ultimately associated with business data records to enhance the understanding of the data
- We might strive to achieve a tighter integration of the text information with the associated data
- Using an OLAP multidimensional data model as the integrating mechanism

INTEGRATING TEXT INFORMATION
- Find attributes in the documents that can be used to link them to the data, or find attributes in the documents that can be used as additional dimensions to deepen the understanding of the data
- Compute quantitative values from the documents

INTEGRATED BIKM TOOLS
- Apply the OLAP data model to text documents, creating a document warehouse
- Allow users to explore data cubes with a star schema through an interface that consists of a report view and navigational controls

DOCUMENT WAREHOUSING
- The fact table granularity is a document
- The dimension tables hold the attributes of the document

SHARED DIMENSION DATA MODEL
[Figure: shared dimension data model]

SHARED DIMENSIONS
- We use star schemas to organize and analyze both data and document cubes
- Providing a mechanism to link them will allow deeper analysis and thereby provide greater value
- The key to achieving this is to link the data directly to the documents through shared dimensions
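The shared-dimension idea can be made concrete with a toy pair of star schemas in SQLite. This is a minimal sketch, not the schema from the paper: the table names (`product_dim`, `sales_fact`, `document_fact`), the sample rows, and the choice to store the document category directly in the fact table instead of in its own dimension table are all assumptions made to keep the example short.

```python
import sqlite3


def build_warehouse():
    """Create an in-memory database with a data cube and a document cube
    that share one dimension table (product_dim)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sales_fact (
            product_id INTEGER REFERENCES product_dim(product_id),
            region TEXT,
            revenue REAL
        );
        -- Fact table granularity is one document per row.
        CREATE TABLE document_fact (
            doc_id INTEGER PRIMARY KEY,
            product_id INTEGER REFERENCES product_dim(product_id),
            category TEXT
        );
        INSERT INTO product_dim VALUES (1, 'Editor'), (2, 'Compiler');
        INSERT INTO sales_fact VALUES
            (1, 'West', 100.0), (1, 'East', 250.0), (2, 'West', 300.0);
        INSERT INTO document_fact VALUES
            (1, 1, 'install failure'), (2, 1, 'install failure'),
            (3, 1, 'pricing'), (4, 2, 'pricing');
        """
    )
    return conn


def revenue_by_product(conn):
    """Roll revenue up to the product level in the data cube."""
    return conn.execute(
        """
        SELECT p.name, SUM(s.revenue)
        FROM sales_fact s JOIN product_dim p ON p.product_id = s.product_id
        GROUP BY p.name ORDER BY p.name
        """
    ).fetchall()


def document_categories(conn, product_name):
    """Follow the shared product dimension into the document cube."""
    return conn.execute(
        """
        SELECT d.category, COUNT(*)
        FROM document_fact d JOIN product_dim p ON p.product_id = d.product_id
        WHERE p.name = ?
        GROUP BY d.category ORDER BY COUNT(*) DESC, d.category
        """,
        (product_name,),
    ).fetchall()


conn = build_warehouse()
print(revenue_by_product(conn))
print(document_categories(conn, "Editor"))
```

The two queries are kept separate on purpose: joining both fact tables in a single query would multiply rows and inflate the revenue totals. An analyst who spots a revenue anomaly for a product in the first result can use the same product key to pull the matching document categories from the second.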

DYNAMIC DIMENSIONS
The new taxonomy can be made available to the document warehouse by creating a corresponding dimension table to represent the taxonomy and then populating an added column in the fact table, associating all known documents with the newly published dimension.

Do you think it is possible to create a consistent taxonomy for documents using the concepts detailed in the paper?
What changes would you suggest to come up with a more useful classification?
Is it a good idea to group documents under a single hierarchy or class?

Thank you