DATA MINING - SELECTED TOPICS




Peter Brezany
Institute for Software Science, University of Vienna
E-mail: brezany@par.univie.ac.at

MINING SPATIAL DATABASES

Spatial Database Systems
- Spatial database systems (SDBSs) offer spatial data types (e.g., points, lines, regions) in their data model and query language.
- Important query types: region queries, nearest-neighbour queries, and spatial joins.
- Spatial data: data related to space.
- Spaces of interest include the earth's surface, VLSI designs, 3D models of the human brain, and 3D arrangements of chains of protein molecules.
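The three query types named above can be sketched on a tiny in-memory point set. This is only an illustration: the point names, coordinates, and function names are made up, and the linear scans stand in for the spatial index a real SDBS would use.

```python
import math

# Hypothetical point data: (name, x, y). Names and coordinates are invented.
POINTS = [("a", 1.0, 1.0), ("b", 4.0, 2.0), ("c", 2.5, 3.5), ("d", 6.0, 6.0)]


def region_query(points, xmin, ymin, xmax, ymax):
    """Return the names of all points inside the axis-aligned query rectangle."""
    return [name for name, x, y in points
            if xmin <= x <= xmax and ymin <= y <= ymax]


def nearest_neighbour(points, qx, qy):
    """Return the name of the point closest to (qx, qy) by Euclidean distance."""
    return min(points, key=lambda p: math.hypot(p[1] - qx, p[2] - qy))[0]


def distance_join(left, right, max_dist):
    """One kind of spatial join: pairs whose points lie within max_dist."""
    return [(l[0], r[0]) for l in left for r in right
            if math.hypot(l[1] - r[1], l[2] - r[2]) <= max_dist]


if __name__ == "__main__":
    print(region_query(POINTS, 0, 0, 3, 4))
    print(nearest_neighbour(POINTS, 5, 5))
    print(distance_join(POINTS[:2], POINTS[2:], 2.2))
```

The join here uses a distance predicate; other join predicates (for example intersection of regions) follow the same pattern with a different test.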

KDD in Spatial Databases
- Extraction of implicit knowledge, spatial relations, or other patterns not explicitly stored in spatial DBs.
- Algorithms for KDD in spatial DBs consider the relevant neighbourhood of the DB objects and their interaction with each other.
- A promising field with fruitful research results and many challenging issues.

Spatial KDD System
[Figure: architecture diagram with the labels User, Controller, Spatial Database, SDBMS, Focus, Data Mining, Evaluation, Discovered Knowledge, Domain Knowledge, Knowledge Base]

Spatial DB Operations for KDD
- Standard operations/queries (region queries, ...)
- Special operations/queries (see figure)
[Figure: example relations between spatial objects A, B, C, and D: A disjoint B, A overlap B, B inside A; A dist=0 B, A dist=c B, A dist<c B; D north A, B east A, C southeast A]
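The relations shown in the figure can be sketched for axis-aligned rectangles. This is a minimal illustration under stated assumptions, not the operations of any particular SDBMS: the `Rect` type, the function names, and the choice to compare directions by centre points are all assumptions made here.

```python
import math
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle; a hypothetical stand-in for a spatial object."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def topological(a, b):
    """Classify a and b as 'disjoint', 'b inside a', 'a inside b', or 'overlap'.

    Rectangles that merely touch are reported as 'overlap' in this sketch.
    """
    if a.xmax < b.xmin or b.xmax < a.xmin or a.ymax < b.ymin or b.ymax < a.ymin:
        return "disjoint"
    if a.xmin <= b.xmin and a.ymin <= b.ymin and b.xmax <= a.xmax and b.ymax <= a.ymax:
        return "b inside a"
    if b.xmin <= a.xmin and b.ymin <= a.ymin and a.xmax <= b.xmax and a.ymax <= b.ymax:
        return "a inside b"
    return "overlap"


def distance(a, b):
    """Minimum Euclidean distance between a and b (0 if they touch or intersect)."""
    dx = max(b.xmin - a.xmax, a.xmin - b.xmax, 0.0)
    dy = max(b.ymin - a.ymax, a.ymin - b.ymax, 0.0)
    return math.hypot(dx, dy)


def direction(b, a):
    """Compass direction of b relative to a, comparing centre points (y grows north)."""
    ax, ay = (a.xmin + a.xmax) / 2, (a.ymin + a.ymax) / 2
    bx, by = (b.xmin + b.xmax) / 2, (b.ymin + b.ymax) / 2
    ns = "north" if by > ay else "south" if by < ay else ""
    ew = "east" if bx > ax else "west" if bx < ax else ""
    return (ns + ew) or "same"


if __name__ == "__main__":
    A = Rect(0, 0, 4, 4)
    print(topological(A, Rect(1, 1, 2, 2)))   # B inside A
    print(distance(A, Rect(6, 0, 8, 4)))      # A dist=c B
    print(direction(Rect(5, -3, 7, -1), A))   # C southeast A
```

Comparing exact centre coordinates is a deliberate simplification: an object only counts as plain "north" if its centre has exactly the same x coordinate, so a practical version would probably use angular sectors or a tolerance.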

Methods for Spatial KDD
- Characterization (generalization): finding a high-level concept description from detailed data.
- Clustering: grouping the objects using similarity.
- Exploring spatial associations: discovering the rules that associate one or more spatial objects with other spatial objects.
- Classification: assigning objects to a given set of classes.
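As a small illustration of the clustering item, the sketch below groups points using distance as the similarity measure. The slides do not name a clustering algorithm; the method chosen here (single-linkage grouping with a distance threshold, implemented with union-find) and the names `cluster_by_distance` and `eps` are assumptions made for the example.

```python
import math


def cluster_by_distance(points, eps):
    """Group point indices so that points within eps of each other share a cluster.

    Membership is transitive: a chain of close points ends up in one cluster.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 10)]
    print(cluster_by_distance(pts, eps=1.5))
```

The pairwise loop is quadratic in the number of points, which is acceptable for a sketch but not for the large spatial databases discussed later in the slides.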

Spatial Characterization
Background knowledge in the form of concept hierarchies is needed.

High-level precipitation concepts:
  very dry         [0-0.1]
  dry              (0.1-0.3]
  moderately dry   (0.3-1.0]
  fair             (1.0-1.2]
  moderately wet   (1.2-2.0]
  wet              (2.0-5.0]
  very wet         5.0 & up

A year-season-month hierarchy:
  year
    spring: Mar., Apr., May
    summer: June, July, Aug.
    autumn: Sept., Oct., Nov.
    winter: Dec., Jan., Feb.
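The two concept hierarchies can be written down as simple lookups. This is a sketch under assumptions: the function names are invented, and a value of exactly 5.0, which the slide lists under both "wet" and "very wet", is assigned to "wet" here.

```python
import bisect

# Upper bounds of the half-open intervals (lo, hi]; the last concept is open-ended.
UPPER_BOUNDS = [0.1, 0.3, 1.0, 1.2, 2.0, 5.0]
CONCEPTS = ["very dry", "dry", "moderately dry", "fair",
            "moderately wet", "wet", "very wet"]

# Month-to-season level of the year-season-month hierarchy.
SEASONS = {
    "spring": ["Mar.", "Apr.", "May"],
    "summer": ["June", "July", "Aug."],
    "autumn": ["Sept.", "Oct.", "Nov."],
    "winter": ["Dec.", "Jan.", "Feb."],
}
SEASON_OF_MONTH = {m: s for s, months in SEASONS.items() for m in months}


def precipitation_concept(value):
    """Generalize a numeric precipitation value to its high-level concept."""
    if value < 0:
        raise ValueError("precipitation cannot be negative")
    # bisect_left keeps a value equal to an upper bound in the lower interval.
    return CONCEPTS[bisect.bisect_left(UPPER_BOUNDS, value)]


def season_of(month):
    """Climb one level in the hierarchy: month -> season."""
    return SEASON_OF_MONTH[month]


if __name__ == "__main__":
    print(precipitation_concept(0.25), season_of("Apr."))
```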

Spatial Characterization (cont.)

  extract region
  from precipitation-map
  where province = B.C. and period = spring and year = 1999
  in relevance to precipitation and region

[Figure: map of the resulting regions, labelled very dry, dry, moderately dry, fair, moderately wet (m.w.), wet, and very wet]
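One way to read the query above is: select the relevant measurements, generalize each region's precipitation to a high-level concept, and group regions that share a concept. The sketch below follows that reading. The records are invented, averaging a region's monthly values is an assumption, and `characterize` is a hypothetical name rather than an operation of the system described in the slides.

```python
import bisect
from collections import defaultdict

UPPER_BOUNDS = [0.1, 0.3, 1.0, 1.2, 2.0, 5.0]
CONCEPTS = ["very dry", "dry", "moderately dry", "fair",
            "moderately wet", "wet", "very wet"]
SPRING = {"Mar.", "Apr.", "May"}

# Hypothetical measurements: (region, province, year, month, precipitation).
RECORDS = [
    ("R1", "B.C.", 1999, "Mar.", 0.2),
    ("R1", "B.C.", 1999, "Apr.", 0.6),
    ("R2", "B.C.", 1999, "May", 2.5),
    ("R3", "B.C.", 1999, "July", 0.05),    # wrong season, filtered out
    ("R4", "Alberta", 1999, "Apr.", 1.1),  # wrong province, filtered out
    ("R5", "B.C.", 1998, "Apr.", 6.0),     # wrong year, filtered out
]


def characterize(records, province, months, year):
    """Return {concept: [regions]} for the records matching the where-clause."""
    per_region = defaultdict(list)
    for region, prov, yr, month, precip in records:
        if prov == province and yr == year and month in months:
            per_region[region].append(precip)

    result = defaultdict(list)
    for region in sorted(per_region):
        values = per_region[region]
        mean = sum(values) / len(values)  # assumption: average over the period
        concept = CONCEPTS[bisect.bisect_left(UPPER_BOUNDS, mean)]
        result[concept].append(region)
    return dict(result)


if __name__ == "__main__":
    print(characterize(RECORDS, "B.C.", SPRING, 1999))
```

In a spatial setting the grouped regions would additionally be merged when they are adjacent, which is what produces the labelled areas on the map; that merge step is omitted here.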

Towards Parallel Spatial KDD
- Many real spatial DBs are getting huge and their complexity is increasing, so more computing resources are needed for KDD.
- CERN (HEP): 5 petabytes/year, 1750 scientists, 150 institutes, 32 countries.
- Medical imaging: a digitized slide is 7 GB in size, at 1000 slides/day.
- Parallel processing is needed.

COMMERCIAL DATA MINING SYSTEMS

Examples of Commercial Data Mining Systems
- Many DM systems specialize in one data mining function, such as classification, or in just one approach to a data mining function, such as decision tree classification.
- Other systems provide a broad spectrum of data mining functions.
- Below we introduce a few systems that provide multiple data mining functions and explore multiple knowledge discovery techniques.
- Prices: usually above 1 million ATS.

Examples of Commercial Data Mining Systems (2)
- Intelligent Miner (IBM): association, classification, regression, predictive modeling, deviation detection, sequential pattern analysis, and clustering; an application toolkit containing neural network algorithms, statistical methods, data preparation tools, and data visualization tools.
- Enterprise Miner (SAS Institute): association, classification, regression, predictive modeling, deviation detection, and clustering; a variety of powerful statistical analysis tools, built on the long history of SAS in the statistical analysis market.
- MineSet (Silicon Graphics (SGI)): a distinguishing feature is its set of robust graphics tools, which use the powerful graphics features of SGI computers.

Examples of Commercial Data Mining Systems (3)
- Clementine (Integral Solutions (ISL)): a distinguishing feature is its object-oriented, extended module interface, which allows users' algorithms and utilities to be added to Clementine's visual programming environment.
- DBMiner (DBMiner Technology): multiple data mining algorithms plus OLAP analysis.
- There are many other commercial data mining products, systems, and research prototypes, and they are also evolving fast.