CHAPTER-12. Analytical Characterization : Analysis of Attribute Relevance
|
|
|
- Priscilla Ramsey
- 9 years ago
- Views:
Transcription
1 CHAPTER-12 Analytical Characterization : Analysis of Attribute Relevance 12.1 Introduction 12.2 Methods of Attribute Relevance Analysis 12.3 Review Questions 12.4 References
2 12. Analytical Characterization : Analysis of Attribute Relevance 12.1 Introduction What if am not sure which attribute to include or class characterization and class comparison? I may end up specifying too many attributes, which could slow down the: system considerably. Measures of attribute relevance analysis can be used to help identify irrelevant or weakly relevant attributes that can be excluded from the concept description process. The incorporation of this preprocessing step into class characterization or comparison is referred to as analytical characterization or analytical comparison, respectively. This section describes a general method of attribute relevance analysis and its integration with attribute-oriented induction.
3 The first limitation of class characterization for multidimensional data analysis in Data warehouses and OLAP tools is the handling of complex objects. The second Limitation is the lack of an automated generalization process: the user must explicitly Tell the system which dimension should be included in the class characterization and to How high a level each dimension should be generalized. Actually, the user must specify each step of generalization or specification on any dimension. Usually, it is not difficult for a user to instruct a data mining system regarding how high level each dimension should be generalized. For example, users can set attribute- generalization thresholds for this, or specify which level a given dimension should reach,such as with the command generalize dimension location to the country level. Even without explicit user instruction, a default value such as 2 to 8 can be set by the data mining system, which would allow each dimension to be generalized to a level that contains only 2 to 8 distinct values. If the user is not satisfied with the current level of generalization, she can specify dimensions on which drill-down or roll-up operations should be applied. It is nontrivial, howesver, for users to determine which dimensions should be included in the analysis of class characteristics. Data relations often contain 50 to 100 attributes, and a user may have little knowledge regarding which attributes or dimensions should be selected for effective data mining. A user may include too few attributes in the analysis, causing the resulting mined descriptions to be incomplete. On the other hand, a user may introduce too many attributes for analysis (e.g., by indicating
4 in relevance to *, which includes all the attributes in the specified relations). Methods should be introduced to perform attribute (or dimension )relevance Analysis in order to filter out statistically irrelevant or weakly relevant attributes, and retain or even rank the most relevant attributes for the descriptive mining task at hand. Class characterization that includes the analysis of attribute/dimesnsion relevance is called analytical characterization. Class comparison that includes such analysis is called analytical comparison. Intuitively, an attribute or dimension is considered highly relevant with respect to a Given class if it is likely that the values of the attribute or dimension may be used to Distinguish the class from others. For example, it is unlikely that the color of an Automobile can be used to distinguish expensive from cheap cars, but the model, make, style, and number of cylinders are likely to be more relevant attributes. Moreover, even within the same dimension, different levels of concepts may have dramatically different powers for distinguishing a class from others. For example, in the birth_date dimension, birth_day and birth_month are unlikely to be relevant to the salary of employees. However, the birth_decade (i.e., age interval) may be highly relevant to the salary of employees. This implies that the analysis of dimension relevance should be performed at multi-levels of abstraction, and only the most relevant levels of a dimension should be included in the analysis.
5 Above we said that attribute/ dimension relevance is evaluated based on the ability of the attribute/ dimension to distinguish objects of a class from others. When mining a class comparison (or discrimination), the target class and the contrasting classes are Explicitly given in the mining query. The relevance analysis should be performed by Comparison of these classes, as we shall see below. However, when mining class Characteristics, there is only one class to be characterized. That is, no contrasting class is specified. It is therefore not obvious what the contrasting class should be for use in of comparable data in the database that excludes the set of data to be characterized. For example, to characterize graduate students, the contrasting class can be composed of the set of undergraduate students Methods of Attribute Relevance Analysis There have been many studies in machine learning, statistics, fuzzy and rough set Theories, and so on, on attribute relevance analysis. The general idea behind attribute Relevance analysis is to compute some measure that is used to quantify the relevance of an attribute with respect to a given class or concept. Such measures include information gain, the Gini index, uncertainity, and correlation coefficients. Here we introduce a method that integrates an information gain analysis technique With a dimension-based data analysis method. The resulting method removes the less
6 informative attributes, collecting the more informative ones for use in concept description analysis. How does the information gain calculation work? Let 5 be a set of training samples, where the class label of each sample is known. Each sample is in fact a tuple. One attribute is used to determine the class of the training samples. For instance, the Attribute status can be used to define the class label of each sample as either graduate or undergraduate. Suppose that there are m classes. Let S contain S i; samples of class C i, for i=1,., m. An arbitrary sample belongs to class C i, with probability S i /S, where S is the total number of samples in set S. The expected information needed to classify a given sample is I(S 1, S 2,..S m )= - (S i /S)(log 2)(S i /S) An attribute A with values { a i, a 2 > >a v) can be used to partition 5 into the Subsets { S i S z,, S v }, where S j contains those samples in 5 that have value a j of A. Let S; contain S y samples of class Q. The expected information based on this Partitioning by A is known as the entropy of A. It is the weighted average: E(A)= y j=1(s i j +.+S m j /S) I (Si j,.s m j ) The information gain obtained by this partitioning on A is defined by
7 Gain(A)=I(S 1,S 2,.S m )-E(A) In this approach to relevance analysis, we can compute the information gain for each of the attributes defining the samples in S. The attribute with the highest information gain is considered the most discriminating attribute of the given set. By computing the information gain for each attribute, we therefore obtain a ranking of the attributes. This ranking can be used for relevance analysis to select the attributes to be used in concept description. Attribute relevance analysis for concept description is performed as follows: Data Collection: Collect data for both the target class and the contrasting class by query processing. For class comparison, the user in the data-mining query provides both the target class and the contrasting class. For class characterization, the target class is the class to be characterized, whereas the contrasting class is the set of comparable data that are not in the target class. Preliminary relevance analysis using conservative AOI: This step identifies a Set of dimensions and attributes on which the selected relevance measure is to be Applied. Since different levels of a dimension may have dramatically different Relevance with respect to a given class, each attribute defining the conceptual levels of the dimension should be included in the relevance analysis in principle. Attribute-oriented induction (AOI)can be used to perform some preliminary
8 relevance analysis on the data by removing or generalizing attributes having a very large number of distinct values (such as name and phone#). Such attributes are unlikely to be found useful for concept description. To be conservative, the AOI performed here should employ attribute generalization thresholds that are set reasonably large so as to allow more (but not all)attributes to be considered in further relevance analysis by the selected measure (Step 3 below). The relation obtained by such an application of AOI is called the candidate relation of the mining task. Remove irrelevant and weakly attributes using the selected relevance analysis measure: Evaluate each attribute in the candidate relation using the selected relevance analysis measure. The relevance measure used in this step may be built into the data mining system or provided by the user. For example, the information gain measure described above may be used. The attributes are then sorted(i.e., ranked )according to their computed relevance to the data mining task. Attributes that are not relevant or are weakly relevant to the task are then removed. A threshold may be set to define weakly relevant. This step results in an initial Target class working relation and an initial contrasting class working relation. Generate the concept description using AOI: Perform AOI using a less Conservative set of attribute generalization thresholds. If the descriptive mining
9 Task is class characterization, only the initial target class working relation is included here. If the descriptive mining task is class comparison, both the initial target class working relation and the initial contrasting class working relation are included. The complexity of this procedure is the induction process is perfomed twice, that Is, in preliminary relevance analysis (Step 2)and on the initial working relation (Step4). The statistics used in attribute relevance analysis with the selected measure (Step 3) may be collected during the scanning of the database in Step Review Questions 1Explain Analytical Characterization? 2 Methods of Attribute Relevance Analysis? 12.4 References [1]. Data Mining Techniques, Arun k pujari 1 st Edition [2].Data warehousung,data Mining and OLAP, Alex Berson,smith.j. Stephen
10 [3].Data Mining Concepts and Techniques,Jiawei Han and MichelineKamber [4]Data Mining Introductory and Advanced topics, Margaret H Dunham PEA [5] The Data Warehouse lifecycle toolkit, Ralph Kimball Wiley student Edition
CHAPTER 4 Data Warehouse Architecture
CHAPTER 4 Data Warehouse Architecture 4.1 Data Warehouse Architecture 4.2 Three-tier data warehouse architecture 4.3 Types of OLAP servers: ROLAP versus MOLAP versus HOLAP 4.4 Further development of Data
CHAPTER-24 Mining Spatial Databases
CHAPTER-24 Mining Spatial Databases 24.1 Introduction 24.2 Spatial Data Cube Construction and Spatial OLAP 24.3 Spatial Association Analysis 24.4 Spatial Clustering Methods 24.5 Spatial Classification
CHAPTER 3. Data Warehouses and OLAP
CHAPTER 3 Data Warehouses and OLAP 3.1 Data Warehouse 3.2 Differences between Operational Systems and Data Warehouses 3.3 A Multidimensional Data Model 3.4Stars, snowflakes and Fact Constellations: 3.5
(b) How data mining is different from knowledge discovery in databases (KDD)? Explain.
Q2. (a) List and describe the five primitives for specifying a data mining task. Data Mining Task Primitives (b) How data mining is different from knowledge discovery in databases (KDD)? Explain. IETE
Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1
Slide 29-1 Chapter 29 Overview of Data Warehousing and OLAP Chapter 29 Outline Purpose of Data Warehousing Introduction, Definitions, and Terminology Comparison with Traditional Databases Characteristics
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction
COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised
Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data
Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.
Data Mining for Knowledge Management. Classification
1 Data Mining for Knowledge Management Classification Themis Palpanas University of Trento http://disi.unitn.eu/~themis Data Mining for Knowledge Management 1 Thanks for slides to: Jiawei Han Eamonn Keogh
Data Mining: Concepts and Techniques. Jiawei Han. Micheline Kamber. Simon Fräser University К MORGAN KAUFMANN PUBLISHERS. AN IMPRINT OF Elsevier
Data Mining: Concepts and Techniques Jiawei Han Micheline Kamber Simon Fräser University К MORGAN KAUFMANN PUBLISHERS AN IMPRINT OF Elsevier Contents Foreword Preface xix vii Chapter I Introduction I I.
Cognos 8 Best Practices
Northwestern University Business Intelligence Solutions Cognos 8 Best Practices Volume 2 Dimensional vs Relational Reporting Reporting Styles Relational Reports are composed primarily of list reports,
Data Preprocessing. Week 2
Data Preprocessing Week 2 Topics Data Types Data Repositories Data Preprocessing Present homework assignment #1 Team Homework Assignment #2 Read pp. 227 240, pp. 250 250, and pp. 259 263 the text book.
Data mining in the e-learning domain
Data mining in the e-learning domain The author is Education Liaison Officer for e-learning, Knowsley Council and University of Liverpool, Wigan, UK. Keywords Higher education, Classification, Data encapsulation,
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Content Problems of managing data resources in a traditional file environment Capabilities and value of a database management
Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Chapter 23, Part A
Data Warehousing and Decision Support Chapter 23, Part A Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical
OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH
OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH 1 Online Analytic Processing OLAP 2 OLAP OLAP: Online Analytic Processing OLAP queries are complex queries that Touch large amounts of data Discover
Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1
Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints
Classification and Prediction
Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser
Data Mining and Database Systems: Where is the Intersection?
Data Mining and Database Systems: Where is the Intersection? Surajit Chaudhuri Microsoft Research Email: [email protected] 1 Introduction The promise of decision support systems is to exploit enterprise
Course 103402 MIS. Foundations of Business Intelligence
Oman College of Management and Technology Course 103402 MIS Topic 5 Foundations of Business Intelligence CS/MIS Department Organizing Data in a Traditional File Environment File organization concepts Database:
Practical Applications of DATA MINING. Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING
Practical Applications of DATA MINING Sang C Suh Texas A&M University Commerce r 3 JONES & BARTLETT LEARNING Contents Preface xi Foreword by Murat M.Tanik xvii Foreword by John Kocur xix Chapter 1 Introduction
OLAP Theory-English version
OLAP Theory-English version On-Line Analytical processing (Business Intelligence) [Ing.J.Skorkovský,CSc.] Department of corporate economy Agenda The Market Why OLAP (On-Line-Analytic-Processing Introduction
Chapter ML:XI. XI. Cluster Analysis
Chapter ML:XI XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained Cluster
Data, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
Dimensional Modeling for Data Warehouse
Modeling for Data Warehouse Umashanker Sharma, Anjana Gosain GGS, Indraprastha University, Delhi Abstract Many surveys indicate that a significant percentage of DWs fail to meet business objectives or
European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project
European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project Janet Delve, University of Portsmouth Kuldar Aas, National Archives of Estonia Rainer Schmidt, Austrian Institute
Business Intelligence Systems
12 Business Intelligence Systems Business Intelligence Systems Bogdan NEDELCU University of Economic Studies, Bucharest, Romania [email protected] The aim of this article is to show the importance
Chapter 20: Data Analysis
Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification
Building Data Cubes and Mining Them. Jelena Jovanovic Email: [email protected]
Building Data Cubes and Mining Them Jelena Jovanovic Email: [email protected] KDD Process KDD is an overall process of discovering useful knowledge from data. Data mining is a particular step in the
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia
DATA VERIFICATION IN ETL PROCESSES
KNOWLEDGE ENGINEERING: PRINCIPLES AND TECHNIQUES Proceedings of the International Conference on Knowledge Engineering, Principles and Techniques, KEPT2007 Cluj-Napoca (Romania), June 6 8, 2007, pp. 282
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 8/05/2005 1 What is data exploration? A preliminary
CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing
CSE 544 Principles of Database Management Systems Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing Class Projects Class projects are going very well! Project presentations: 15 minutes On Wednesday
DATA CUBES E0 261. Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 DATA CUBES
E0 261 Jayant Haritsa Computer Science and Automation Indian Institute of Science JAN 2014 Slide 1 Introduction Increasingly, organizations are analyzing historical data to identify useful patterns and
DSS based on Data Warehouse
DSS based on Data Warehouse C_13 / 6.01.2015 Decision support system is a complex system engineering. At the same time, research DW composition, DW structure and DSS Architecture based on DW, puts forward
Multi-dimensional index structures Part I: motivation
Multi-dimensional index structures Part I: motivation 144 Motivation: Data Warehouse A definition A data warehouse is a repository of integrated enterprise data. A data warehouse is used specifically for
Knowledge Discovery and Data. Data Mining vs. OLAP
Knowledge Discovery and Data Mining Data Mining vs. OLAP Sajjad Haider Spring 2010 1 Acknowledgement All the material in this presentation is taken from the Internet. A simple search of Data Mining vs.
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000
2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 Introduction This course provides students with the knowledge and skills necessary to design, implement, and deploy OLAP
Introduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
MINING CLICKSTREAM-BASED DATA CUBES
MINING CLICKSTREAM-BASED DATA CUBES Ronnie Alves and Orlando Belo Departament of Informatics,School of Engineering, University of Minho Campus de Gualtar, 4710-057 Braga, Portugal Email: {alvesrco,obelo}@di.uminho.pt
Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives
Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives Describe how the problems of managing data resources in a traditional file environment are solved
SPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
Data Mining: Data Preprocessing. I211: Information infrastructure II
Data Mining: Data Preprocessing I211: Information infrastructure II 10 What is Data? Collection of data objects and their attributes Attributes An attribute is a property or characteristic of an object
Database Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
Introduction. A. Bellaachia Page: 1
Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.
OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP
Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key
An Introduction to Data Warehousing. An organization manages information in two dominant forms: operational systems of
An Introduction to Data Warehousing An organization manages information in two dominant forms: operational systems of record and data warehouses. Operational systems are designed to support online transaction
Introduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
The Microsoft Business Intelligence 2010 Stack Course 50511A; 5 Days, Instructor-led
The Microsoft Business Intelligence 2010 Stack Course 50511A; 5 Days, Instructor-led Course Description This instructor-led course provides students with the knowledge and skills to develop Microsoft End-to-
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
Implementing Data Models and Reports with Microsoft SQL Server 2012 MOC 10778
Implementing Data Models and Reports with Microsoft SQL Server 2012 MOC 10778 Course Outline Module 1: Introduction to Business Intelligence and Data Modeling This module provides an introduction to Business
Designing a Dimensional Model
Designing a Dimensional Model Erik Veerman Atlanta MDF member SQL Server MVP, Microsoft MCT Mentor, Solid Quality Learning Definitions Data Warehousing A subject-oriented, integrated, time-variant, and
Efficient Integration of Data Mining Techniques in Database Management Systems
Efficient Integration of Data Mining Techniques in Database Management Systems Fadila Bentayeb Jérôme Darmont Cédric Udréa ERIC, University of Lyon 2 5 avenue Pierre Mendès-France 69676 Bron Cedex France
An Approach for Facilating Knowledge Data Warehouse
International Journal of Soft Computing Applications ISSN: 1453-2277 Issue 4 (2009), pp.35-40 EuroJournals Publishing, Inc. 2009 http://www.eurojournals.com/ijsca.htm An Approach for Facilating Knowledge
Basics of Dimensional Modeling
Basics of Dimensional Modeling Data warehouse and OLAP tools are based on a dimensional data model. A dimensional model is based on dimensions, facts, cubes, and schemas such as star and snowflake. Dimensional
DATA WAREHOUSE E KNOWLEDGE DISCOVERY
DATA WAREHOUSE E KNOWLEDGE DISCOVERY Prof. Fabio A. Schreiber Dipartimento di Elettronica e Informazione Politecnico di Milano DATA WAREHOUSE (DW) A TECHNIQUE FOR CORRECTLY ASSEMBLING AND MANAGING DATA
Professor Anita Wasilewska. Classification Lecture Notes
Professor Anita Wasilewska Classification Lecture Notes Classification (Data Mining Book Chapters 5 and 7) PART ONE: Supervised learning and Classification Data format: training and test data Concept,
Dimensional Modeling and E-R Modeling In. Joseph M. Firestone, Ph.D. White Paper No. Eight. June 22, 1998
1 of 9 5/24/02 3:47 PM Dimensional Modeling and E-R Modeling In The Data Warehouse By Joseph M. Firestone, Ph.D. White Paper No. Eight June 22, 1998 Introduction Dimensional Modeling (DM) is a favorite
A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH
205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
Implementing Data Models and Reports with Microsoft SQL Server
Course 20466C: Implementing Data Models and Reports with Microsoft SQL Server Course Details Course Outline Module 1: Introduction to Business Intelligence and Data Modeling As a SQL Server database professional,
Overview. Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Data Warehousing. An Example: The Store (e.g.
Overview Data Warehousing and Decision Support Chapter 25 Why data warehousing and decision support Data warehousing and the so called star schema MOLAP versus ROLAP OLAP, ROLLUP AND CUBE queries Design
Data Mining for Successful Healthcare Organizations
Data Mining for Successful Healthcare Organizations For successful healthcare organizations, it is important to empower the management and staff with data warehousing-based critical thinking and knowledge
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of
Why Business Intelligence
Why Business Intelligence Ferruccio Ferrando z IT Specialist Techline Italy March 2011 page 1 di 11 1.1 The origins In the '50s economic boom, when demand and production were very high, the only concern
Data Warehousing and Data Mining
Data Warehousing and Data Mining Winter Semester 2012/2013 Free University of Bozen, Bolzano DM Lecturer: Mouna Kacimi [email protected] http://www.inf.unibz.it/dis/teaching/dwdm/index.html Organization
1. What are the uses of statistics in data mining? Statistics is used to Estimate the complexity of a data mining problem. Suggest which data mining
1. What are the uses of statistics in data mining? Statistics is used to Estimate the complexity of a data mining problem. Suggest which data mining techniques are most likely to be successful, and Identify
STATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
Data quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
M2074 - Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 5 Day Course
Module 1: Introduction to Data Warehousing and OLAP Introducing Data Warehousing Defining OLAP Solutions Understanding Data Warehouse Design Understanding OLAP Models Applying OLAP Cubes At the end of
T-61.6010 Non-discriminatory Machine Learning
T-61.6010 Non-discriminatory Machine Learning Seminar 1 Indrė Žliobaitė Aalto University School of Science, Department of Computer Science Helsinki Institute for Information Technology (HIIT) University
Data W a Ware r house house and and OLAP II Week 6 1
Data Warehouse and OLAP II Week 6 1 Team Homework Assignment #8 Using a data warehousing tool and a data set, play four OLAP operations (Roll up (drill up), Drill down (roll down), Slice and dice, Pivot
Data Mining Introduction
Data Mining Introduction Organization Lectures Mondays and Thursdays from 10:30 to 12:30 Lecturer: Mouna Kacimi Office hours: appointment by email Labs Thursdays from 14:00 to 16:00 Teaching Assistant:
COM CO P 5318 Da t Da a t Explora Explor t a ion and Analysis y Chapte Chapt r e 3
COMP 5318 Data Exploration and Analysis Chapter 3 What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping
3. Mathematical Induction
3. MATHEMATICAL INDUCTION 83 3. Mathematical Induction 3.1. First Principle of Mathematical Induction. Let P (n) be a predicate with domain of discourse (over) the natural numbers N = {0, 1,,...}. If (1)
Regular Expressions and Automata using Haskell
Regular Expressions and Automata using Haskell Simon Thompson Computing Laboratory University of Kent at Canterbury January 2000 Contents 1 Introduction 2 2 Regular Expressions 2 3 Matching regular expressions
COURSE SYLLABUS COURSE TITLE:
1 COURSE SYLLABUS COURSE TITLE: FORMAT: CERTIFICATION EXAMS: 55043AC Microsoft End to End Business Intelligence Boot Camp Instructor-led None This course syllabus should be used to determine whether the
Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal
Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether
When to consider OLAP?
When to consider OLAP? Author: Prakash Kewalramani Organization: Evaltech, Inc. Evaltech Research Group, Data Warehousing Practice. Date: 03/10/08 Email: [email protected] Abstract: Do you need an OLAP
Decision Support. Chapter 23. Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1
Decision Support Chapter 23 Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful
DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM
INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM M. Mayilvaganan 1, S. Aparna 2 1 Associate
Credit Risk Models. August 24 26, 2010
Credit Risk Models August 24 26, 2010 AGENDA 1 st Case Study : Credit Rating Model Borrowers and Factoring (Accounts Receivable Financing) pages 3 10 2 nd Case Study : Credit Scoring Model Automobile Leasing
Welcome to the topic on creating key performance indicators in SAP Business One, release 9.1 version for SAP HANA.
Welcome to the topic on creating key performance indicators in SAP Business One, release 9.1 version for SAP HANA. 1 In this topic, you will learn how to: Use Key Performance Indicators (also known as
Data Warehouses in the Path from Databases to Archives
Data Warehouses in the Path from Databases to Archives Gabriel David FEUP / INESC-Porto This position paper describes a research idea submitted for funding at the Portuguese Research Agency. Introduction
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant
A Brief Tutorial on Database Queries, Data Mining, and OLAP
A Brief Tutorial on Database Queries, Data Mining, and OLAP Lutz Hamel Department of Computer Science and Statistics University of Rhode Island Tyler Hall Kingston, RI 02881 Tel: (401) 480-9499 Fax: (401)
Anwendersoftware Anwendungssoftwares a. Data-Warehouse-, Data-Mining- and OLAP-Technologies. Online Analytic Processing
Anwendungssoftwares a Data-Warehouse-, Data-Mining- and OLAP-Technologies Online Analytic Processing Online Analytic Processing OLAP Online Analytic Processing Technologies and tools that support (ad-hoc)
Foundations of Business Intelligence: Databases and Information Management
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 6.1 2010 by Prentice Hall LEARNING OBJECTIVES Describe how the problems of managing data resources in a traditional
A Pseudo Nearest-Neighbor Approach for Missing Data Recovery on Gaussian Random Data Sets
University of Nebraska at Omaha DigitalCommons@UNO Computer Science Faculty Publications Department of Computer Science -2002 A Pseudo Nearest-Neighbor Approach for Missing Data Recovery on Gaussian Random
A pseudo-nearest-neighbor approach for missing data recovery on Gaussian random data sets
Pattern Recognition Letters 23(2002) 1613 1622 www.elsevier.com/locate/patrec A pseudo-nearest-neighbor approach for missing data recovery on Gaussian random data sets Xiaolu Huang, Qiuming Zhu * Digital
