A Journey from DBMS to Data Mining


1 A Journey from DBMS to Data Mining. Aditya Bagchi. Short-Term Training Programme on Knowledge Discovery in Databases (DInK 10), Indian Statistical Institute, Kolkata, January 11-15, 2010.

2 Introduction to Database Management Systems

3 What is a Database Management System? A Database Management System, popularly called a DBMS, manages a set of logically interconnected files that describe a problem domain.

4 A DBMS provides facilities to: create the database that contains the files; port data to the database; add to and alter the database structure; access and manipulate data stored in the database according to the needs of the users.

5 Additional Facilities: recovering from sudden system crashes without disturbing the content of a database; controlling the access of a large group of users in a multi-user environment; providing adequate data security arrangements, so that access to different parts of a database in different modes (Read/Write/Update etc.) can be controlled for different sets of users.

6 Data(base) Models. A data model serves to describe the structure of the database. It is not concerned with the actual values; it concentrates on the relationships among the data. A data model has a structure, a set of operators and a set of navigational rules. It can be designed either as an Object-based Data Model or as a Record-based Data Model.

7 In the Object-based Data Model, the problem domain is broken into a number of real-life object types with their associated attributes and constraints. Facilities to manipulate each type of object may also be associated with the corresponding object structures. Interconnections between different types of objects are also defined. Some of these models are not implementable, but they provide a better understanding of a problem domain.

8 Entity Relationship Modeling. Proposed by Peter Chen. Key component: the Entity Relationship Diagram (ERD). Entity: an identifiable object or concept of significance. Relationship: an association between entities. Attribute: a property of an entity (or relationship).

9 Entity: mutually exclusive in all cases; must be uniquely identifiable. Type: regular/strong, weak. An entity set is a set of entities of the same type that share the same properties. Entity sets need not be disjoint.

10 An attribute is a function that maps from the entity set into a domain. Attribute: value, domain. Type: simple, composite, single-valued, multi-valued, derived, key, null, complex. A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity. A candidate key of an entity set is a minimal super key.
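
The two key definitions above can be checked mechanically against a sample of an entity set. The following is a minimal sketch, assuming entities are represented as Python dictionaries; the sample rows and the attribute names `e-no`, `e-name` and `desig` are illustrative, not data from the slides. Note that a key is a property of the schema, so a sample can refute a proposed key but cannot prove one.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """Return True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attrs):
    """A candidate key is a super key with no proper, non-empty subset that is a super key."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Hypothetical sample of an Employee entity set.
employees = [
    {"e-no": 1, "e-name": "Asha", "desig": "Engineer"},
    {"e-no": 2, "e-name": "Ravi", "desig": "Engineer"},
    {"e-no": 3, "e-name": "Asha", "desig": "Manager"},
]
```

In this sample `("e-no", "e-name")` is a super key but not a candidate key, because `("e-no",)` alone already identifies every entity.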

11 Relationship: A relationship is an association among several entities. A relationship set is a mathematical relation among n ≥ 2 entities, each taken from an entity set. If E1, E2, ..., En are entity sets, then a relationship set R is a subset of E1 × E2 × ... × En, the set of all possible relationships (e1, e2, ..., en) with each ei drawn from Ei. The entity sets E1, E2, ..., En participate in the relationship set R.

12 Degree of a relationship set: the number of entity sets that participate in the relationship set. Mapping cardinalities (or cardinality ratios): the number of entities of one entity set that can be associated with an entity of another entity set via a relationship set. Possible mapping cardinalities are: 1:1, 1:N, N:1 and N:M.

13 Participation constraints: If every entity of an entity set E participates in at least one relationship in a relationship set R, the participation of E in R is called total. If only some entities of E participate in R, the participation of E in R is called partial.
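
Mapping cardinality and participation can both be read off a concrete relationship set. The sketch below assumes a binary relationship set stored as a list of (left, right) pairs; the function names and the sample `posting` data are hypothetical. It classifies only the given instance, whereas the constraints themselves are design decisions about all valid instances.

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity occurs at `position` in some relationship, else 'partial'."""
    participating = {relationship[position] for relationship in relationship_set}
    return "total" if set(entity_set) <= participating else "partial"


def mapping_cardinality(relationship_set):
    """Classify a binary relationship set instance as '1:1', '1:N', 'N:1' or 'N:M'."""
    partners_of_left = {}
    partners_of_right = {}
    for left, right in relationship_set:
        partners_of_left.setdefault(left, set()).add(right)
        partners_of_right.setdefault(right, set()).add(left)
    left_has_many = any(len(p) > 1 for p in partners_of_left.values())
    right_has_many = any(len(p) > 1 for p in partners_of_right.values())
    if left_has_many and right_has_many:
        return "N:M"
    if left_has_many:
        return "1:N"
    if right_has_many:
        return "N:1"
    return "1:1"


# Hypothetical instance: (department, employee) pairs of a posting relationship.
departments = ["Accounts", "Sales", "HR"]
employees = ["e1", "e2", "e3"]
posting = [("Accounts", "e1"), ("Accounts", "e2"), ("Sales", "e3")]
```

Here every employee is posted somewhere (total participation), while the HR department has no employee (partial participation), and one department is related to many employees (1:N).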

14 [Figure: ER diagram notation, with symbols for (strong) entity, weak entity, single/simple, composite, multi-valued and derived attributes, (strong) key and weak key, identifying relationship, and a recursive relationship with roles role1 and role2.]

15 Problem: An IT company is involved in the design of software products. It has many departments at various locations. Each employee of the company is posted to one department only. The following information about the employees is maintained: name, address, date of birth, date of joining, designation and monthly salary. Departments are identified by a unique name. The company gets projects from various organizations, whose names and addresses are stored.

16 Each project is identified by a unique project number and a unique name. In addition, the budget, starting date and expected date of completion of each project are maintained. The company also maintains information on the number of projects in which each employee is involved. Each employee may be associated with one or more projects. An employee associated with a project has a duration of service in that project and a responsibility, either as a member or as the leader. A project will have only one project leader.

17 [Figure: ER Diagram for the problem. Entities: Employee (e-no, e-name, e-address, dob, doj, desig, sal, no of p), Department (d-name, location), Project (p-name, budget, starting-dt, completion-dt), Organization (o-name, address). Relationships: posting, involved in (with attribute duration), has leader, give.]

18 A Record-based Model describes the record types associated with a problem domain. The most popular Record-based Model is the Relational Model, where each record type is modeled as a 2-dimensional table. Specific mapping rules are available to convert an Object-based design to a Record-based design: mapping of entity / weak entity types; mapping of relationship types (binary, n-ary); mapping of multivalued attributes. The relational model does not allow set or tuple type attributes.

19 For each (strong) entity type E in the ER schema, create a relation R that includes all the simple attributes of E. Choose one of the key attributes of E as the primary key for R. If the chosen key of E is composite, the set of simple attributes that form it will together form the primary key of R. For each weak entity type W in the ER schema with owner entity type E, create a relation R and include all simple attributes (or simple components of composite attributes) of W as attributes of R. In addition, include as foreign key attributes of R the primary key attribute(s) of the relation(s) that correspond to the owner entity type(s). The primary key of R is the combination of the primary key(s) of the owner(s) and the partial key of the weak entity type W, if any.

20 Mapping Binary 1:1 Relationship Types. For each binary 1:1 relationship type R, let S and T be the relations that correspond to the participating entity types. There are three possible approaches. i. Foreign key approach: choose one of the relations, say S, and include as a foreign key in S the primary key of T. It is better to choose an entity type with total participation in R in the role of S. ii. Merged relation approach: merge the two entity types and the relationship into a single relation. iii. Cross-reference approach: set up a third relation R for the purpose of cross-referencing the primary keys of the two relations S and T representing the entity types.

21 Mapping Binary 1:N Relationship Types. i. For each regular binary 1:N relationship type R, identify the relation S that represents the participating entity type at the N-side of the relationship type. ii. Include as a foreign key in S the primary key of the relation T that represents the other entity type participating in R. iii. Include any simple attributes of the 1:N relationship type as attributes of S.

22 Mapping Binary M:N Relationship Types. i. For each regular binary M:N relationship type R, create a new relation S to represent R. ii. Include as foreign key attributes in S the primary keys of the relations that represent the participating entity types; their combination will form the primary key of S. iii. Also include any simple attributes of the M:N relationship type (or simple components of composite attributes) as attributes of S.

23 Mapping Multivalued Attributes. i. For each multivalued attribute A, create a new relation R. This relation R will include an attribute corresponding to A, plus the primary key attribute K (as a foreign key in R) of the relation that represents the entity type or relationship type that has A as an attribute. ii. The primary key of R is the combination of A and K. If the multivalued attribute is composite, we include its simple components.

24 Mapping N-ary Relationship Types. For each n-ary relationship type R, where n > 2, create a new relation S to represent R. Include as foreign key attributes in S the primary keys of the relations that represent the participating entity types. Also include any simple attributes of the n-ary relationship type (or simple components of composite attributes) as attributes of S.

25 ER to Relational Mapping:
Employee (emp-no, emp-name, emp-address, dob, doj, desig, sal, dept-name)
Department (dept-name, location)
Project (proj-name, proj-budget, starting-dt, proj-duration, dept-name, o-name)
Involvement (emp-no, proj-name, duration, responsibility)
Organization (o-name, address)
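
The relations above can be written as table definitions. The following is a sketch using Python's standard `sqlite3` module; it is not the author's implementation. Several details are assumptions: hyphens in attribute names are replaced by underscores so they are valid SQL identifiers, the column types are guesses, `proj_name` is taken as the key of Project, and the `CHECK` on `responsibility` encodes the "member or leader" wording of the problem statement.

```python
import sqlite3

# Assumed types and key choices; only the attribute lists come from the slide.
SCHEMA = """
CREATE TABLE Department (
    dept_name TEXT PRIMARY KEY,
    location  TEXT
);
CREATE TABLE Organization (
    o_name  TEXT PRIMARY KEY,
    address TEXT
);
CREATE TABLE Employee (
    emp_no      INTEGER PRIMARY KEY,
    emp_name    TEXT,
    emp_address TEXT,
    dob         TEXT,
    doj         TEXT,
    desig       TEXT,
    sal         REAL,
    -- 1:N posting: the N-side relation carries the foreign key
    dept_name   TEXT REFERENCES Department(dept_name)
);
CREATE TABLE Project (
    proj_name     TEXT PRIMARY KEY,
    proj_budget   REAL,
    starting_dt   TEXT,
    proj_duration INTEGER,
    dept_name     TEXT REFERENCES Department(dept_name),
    o_name        TEXT REFERENCES Organization(o_name)
);
CREATE TABLE Involvement (
    emp_no         INTEGER REFERENCES Employee(emp_no),
    proj_name      TEXT REFERENCES Project(proj_name),
    duration       INTEGER,
    responsibility TEXT CHECK (responsibility IN ('member', 'leader')),
    -- M:N relationship: the combined foreign keys form the primary key
    PRIMARY KEY (emp_no, proj_name)
);
"""


def create_database():
    """Create the schema in an in-memory SQLite database and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript(SCHEMA)
    return conn
```

With foreign keys enabled, inserting an employee whose `dept_name` does not exist in Department is rejected, which is the relational counterpart of the posting relationship in the ER diagram.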

26 Operators: Unary Operators (applicable to a single relation) and Binary Operators (applicable to two relations). Unary Operators: Selection (σ): selects one or more rows or tuples of a relation. σ_θ(R), where θ is the set of conditions or predicates for selection. Find all employees working in the Accounts department and having a salary greater than Rs. 10000/-: σ_{sal > 10000 ∧ dept-name = 'Accounts'}(Employee)

27 Projection (π): selects one or more attributes of a relation. π_C(R), where C is the set of attributes selected. List the name and address of all the employees: π_{name,address}(Employee). Combination: list the name and address of all the employees working in the Accounts department and having a salary greater than Rs. 10000/-: π_{name,address}(σ_{sal > 10000 ∧ dept-name = 'Accounts'}(Employee))
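
Selection and projection compose naturally as functions over a list of tuples. The sketch below assumes a relation is a list of Python dictionaries; the helper names `select` and `project` and the sample rows are illustrative. Projection removes duplicates, since a relation is a set of tuples.

```python
def select(relation, predicate):
    """Selection (sigma): keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection (pi): keep only the named attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


# Hypothetical Employee relation.
employee = [
    {"name": "Asha", "address": "Kolkata", "sal": 12000, "dept-name": "Accounts"},
    {"name": "Ravi", "address": "Mumbai", "sal": 9000, "dept-name": "Accounts"},
    {"name": "Mina", "address": "Delhi", "sal": 15000, "dept-name": "Sales"},
]

# pi_{name,address}(sigma_{sal > 10000 AND dept-name = 'Accounts'}(Employee))
well_paid_accounts = project(
    select(employee, lambda r: r["sal"] > 10000 and r["dept-name"] == "Accounts"),
    ["name", "address"],
)
```

The nesting of the two calls mirrors the nesting of π around σ in the algebra expression.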

28 Binary Operators: Natural Join (⋈): joins two relations by equating the values of their common attributes. Find the name and address of the employees working in the Accounts department and placed in Mumbai: π_{name,address}(σ_{location = 'Mumbai' ∧ dept-name = 'Accounts'}(Department ⋈ Employee)). Query Language (SQL): Select name, address From Department, Employee Where Department.location = 'Mumbai' and Employee.dept-name = 'Accounts' and Employee.dept-name = Department.dept-name
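
A natural join can be sketched directly from its definition: pair every tuple of one relation with every tuple of the other and keep the pairs that agree on all shared attributes. This is a nested-loop illustration under the assumption that relations are lists of dictionaries with uniform keys; the sample data is hypothetical, and a real DBMS would use more efficient join algorithms.

```python
def natural_join(r, s):
    """Natural join: combine tuples that agree on every attribute the relations share."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    return [
        {**row_r, **row_s}
        for row_r in r
        for row_s in s
        if all(row_r[a] == row_s[a] for a in common)
    ]


# Hypothetical Department and Employee relations sharing the attribute dept-name.
department = [
    {"dept-name": "Accounts", "location": "Mumbai"},
    {"dept-name": "Sales", "location": "Delhi"},
]
employee = [
    {"name": "Asha", "address": "Andheri", "dept-name": "Accounts"},
    {"name": "Mina", "address": "Saket", "dept-name": "Sales"},
]

joined = natural_join(department, employee)
answer = [
    {"name": row["name"], "address": row["address"]}
    for row in joined
    if row["location"] == "Mumbai" and row["dept-name"] == "Accounts"
]
```

The final comprehension plays the role of the selection and projection applied on top of the join.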

29 Multiple Joins: Find the name and address of the employees working in the Accounts department, placed in Mumbai and associated with the project DST 55/10. π_{name,address}(σ_{location = 'Mumbai' ∧ dept-name = 'Accounts' ∧ proj-name = 'DST 55/10'}(Department ⋈ Employee ⋈ Project)). Select name, address From Department, Employee, Project Where Department.location = 'Mumbai' and Employee.dept-name = 'Accounts' and Project.proj-name = 'DST 55/10' and Employee.dept-name = Department.dept-name and Employee.dept-name = Project.dept-name
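
The three-relation query can be tried out with Python's standard `sqlite3` module. The sketch below follows the join conditions of the query above (all three relations are joined on the department name); the cut-down tables, the underscore column names and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (dept_name TEXT, location TEXT);
CREATE TABLE Employee (name TEXT, address TEXT, dept_name TEXT);
CREATE TABLE Project (proj_name TEXT, dept_name TEXT);
INSERT INTO Department VALUES ('Accounts', 'Mumbai'), ('Sales', 'Delhi');
INSERT INTO Employee VALUES ('Asha', 'Andheri', 'Accounts'), ('Mina', 'Saket', 'Sales');
INSERT INTO Project VALUES ('DST 55/10', 'Accounts'), ('CSIR 7/09', 'Sales');
""")

# Explicit JOIN ... ON clauses express the same equalities as the WHERE-clause joins.
query = """
SELECT Employee.name, Employee.address
FROM Department
JOIN Employee ON Employee.dept_name = Department.dept_name
JOIN Project ON Project.dept_name = Employee.dept_name
WHERE Department.location = ?
  AND Employee.dept_name = ?
  AND Project.proj_name = ?
"""
rows = conn.execute(query, ("Mumbai", "Accounts", "DST 55/10")).fetchall()
```

Passing the constants as parameters keeps the quoting of string values out of the query text.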

30 Set Operators: the two relations must have the same arity (the same number of attributes), and the attributes in corresponding positions must be of the same domain. Example (Banking Environment): Deposit (b_name, c_name, ac_no, balance), Borrow (b_name, c_name, ln_no, amount), where b_name = branch name, c_name = customer name, ac_no = account number, ln_no = loan number.

31 List the names of customers who are depositors as well as borrowers in the ISI branch. (π_{c_name}(σ_{b_name = 'ISI'}(Deposit))) ∩ (π_{c_name}(σ_{b_name = 'ISI'}(Borrow))) (??) (π_{c_name}(σ_{b_name = 'ISI'}(Deposit Borrow))) Select c_name From Deposit Where b_name = 'ISI' Intersect Select c_name From Borrow Where b_name = 'ISI'
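
The intersection query can be run as written once the string constants are quoted. The following sketch uses Python's standard `sqlite3` module with hypothetical rows for the Deposit and Borrow relations; the column types are assumptions. The two projections are union-compatible (one attribute, same domain), which is what makes the set operator applicable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Deposit (b_name TEXT, c_name TEXT, ac_no INTEGER, balance REAL);
CREATE TABLE Borrow (b_name TEXT, c_name TEXT, ln_no INTEGER, amount REAL);
INSERT INTO Deposit VALUES
    ('ISI', 'Asha', 101, 5000), ('ISI', 'Ravi', 102, 700), ('Salt Lake', 'Mina', 103, 900);
INSERT INTO Borrow VALUES
    ('ISI', 'Asha', 201, 20000), ('Salt Lake', 'Ravi', 202, 3000), ('ISI', 'Mina', 203, 100);
""")

rows = conn.execute("""
SELECT c_name FROM Deposit WHERE b_name = 'ISI'
INTERSECT
SELECT c_name FROM Borrow WHERE b_name = 'ISI'
""").fetchall()

# Customers who both deposit and borrow at the ISI branch.
both = [name for (name,) in rows]
```

In this sample Ravi deposits at ISI but borrows elsewhere, and Mina borrows at ISI but deposits elsewhere, so only Asha appears in the result.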

32 A First Visit to the World of Data Mining

33 Data Mining is a method of finding interesting trends or patterns in large datasets. Discovered patterns help and guide the appropriate authority in taking future decisions, so Data Mining is regarded as a tool for Decision Support. Data Mining tools are expected to involve minimal user intervention. Since the data volume is very large, efficiency and scalability are two very important criteria for data mining algorithms.

34 Data Mining Communities. 1. Statistics: provides the background for the algorithms. 2. Artificial Intelligence: provides the required heuristics for machine learning / conceptual clustering. 3. Data Management: provides the platform for storage and retrieval of raw and summary data.

35 A Data Mining Effort Involves: data collection; data preprocessing and feature extraction; discovery of patterns; visualization of data; evaluation of results.

36 Initial Activities. 1. Data Cleaning: data may be incomplete, noisy and inconsistent. Cleaning identifies outliers, fills in missing values and corrects inconsistencies. 2. Data Integration & Transformation: data analysis may involve integrating data from different sources, as in a Data Warehouse. The sources may include databases, data cubes or flat files. Data needs to be transformed or consolidated into forms suitable for mining, e.g. attribute values converted from absolute values to ranges. 3. Data Reduction: since both the data volume and the attribute set may be too large, data reduction becomes necessary, e.g. removal of irrelevant and redundant attributes, generation of summary data etc.
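
The transformation step mentioned above, converting absolute attribute values to ranges, can be sketched as a small binning function. The boundaries and the label format below are assumptions chosen for illustration; the slides do not prescribe a particular discretization.

```python
def to_range(value, boundaries):
    """Map a numeric value to a range label, given boundaries in ascending order.

    A label 'a-b' covers values v with a <= v < b.
    """
    previous = None
    for bound in boundaries:
        if value < bound:
            return f"<{bound}" if previous is None else f"{previous}-{bound}"
        previous = bound
    return f">={boundaries[-1]}"


# Hypothetical income boundaries.
income_boundaries = [20000, 50000]
incomes = [15000, 20000, 30000, 50000, 80000]
income_ranges = [to_range(v, income_boundaries) for v in incomes]
```

After this step a mining algorithm can treat income as a categorical attribute with three values instead of an unbounded numeric one.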

37 Mining Activities. 1. Rule Discovery: discovery of association rules from the different features involved in a problem domain. 2. Data Clustering: grouping based on conceptual clustering, maximizing the intra-cluster similarity and minimizing the inter-cluster similarity. 3. Data Classification: grouping of data and placement of such data groups in a taxonomy. 4. Searching of Sequential Patterns: discovery of patterns involved in a temporal sequence.

38 Knowledge Discovery from Databases: discovery of patterns among the attributes of a relation, for possible classification of data; discovery of patterns among the attributes of multiple relations; discovery of patterns from the temporal variation of data (discovery of patterns from a Data Warehouse).

39 CAEP (Classification by Aggregating Emerging Patterns) uses support computation to find Emerging Patterns (EPs). Let there be two classes, C1 (buys_car = 'yes') and C2 (buys_car = 'no'). The itemset (age 25, income 20K) is a typical EP if its support increases from, say, 0.2% in C1 to 57.6% in C2, a growth rate of 57.6/0.2 = 288. Usually an equality test is done for a categorical attribute, while membership in a range or interval is checked for a numerical attribute. An EP is a multi-attribute test whose differentiating power is checked for class membership. The differentiating power of an EP is derived from its growth rate and its support in the target class.
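
The support and growth-rate arithmetic above can be sketched in a few lines. This is only an illustration of the two quantities, not the CAEP classifier itself. The sample rows and the comparison operators in the itemset test (age at most 25, income at least 20K) are assumptions, since the slide text does not show the operators.

```python
def support(itemset_test, rows):
    """Fraction of rows in a class that satisfy the multi-attribute test."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if itemset_test(row)) / len(rows)


def growth_rate(support_from, support_to):
    """Ratio by which support grows from one class to the other."""
    if support_from == 0:
        return float("inf") if support_to > 0 else 0.0
    return support_to / support_from


def young_and_earning(row):
    """Hypothetical itemset test; the operators are assumed, not taken from the slide."""
    return row["age"] <= 25 and row["income"] >= 20


# Hypothetical class samples (income in thousands).
c1 = [{"age": 24, "income": 30}] + [{"age": 45, "income": 60}] * 9
c2 = [{"age": 23, "income": 25}] * 5 + [{"age": 50, "income": 10}] * 5

rate_on_samples = growth_rate(support(young_and_earning, c1), support(young_and_earning, c2))
rate_from_slide = growth_rate(0.2, 57.6)  # approximately 288, as in the slide
```

A large growth rate means the pattern is rare in one class and common in the other, which is what gives an EP its differentiating power.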

40 [Figure: Decision Tree on the concept buys_new_car, branching on Age, Income and Marital Status (Married / Single), with Yes / No leaves.]

41 Discovery of Patterns from Multiple Relations. The usual approach tends to join all relations to generate a large Universal Relation. This creates unnecessary repetition of data, brings in too many attributes, and needs a massive data cleaning and reduction effort before any mining algorithm can be applied.

42 Discovery of patterns from temporal variation of data. Data in an operational database varies over time. Temporally invariant data is stored in a Data Warehouse. Temporal patterns can be discovered from such Data Warehouses. This is important in long-term planning, the study of social and economic changes etc.


44 Thank You

Chapter 7 Data Modeling Using the Entity- Relationship (ER) Model

Chapter 7 Data Modeling Using the Entity- Relationship (ER) Model Chapter 7 Data Modeling Using the Entity- Relationship (ER) Model Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 7 Outline Using High-Level Conceptual Data Models for

More information

www.gr8ambitionz.com

www.gr8ambitionz.com Data Base Management Systems (DBMS) Study Material (Objective Type questions with Answers) Shared by Akhil Arora Powered by www. your A to Z competitive exam guide Database Objective type questions Q.1

More information

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB Outline Database concepts Conceptual Design Logical Design Communicating with the RDBMS 2 Some concepts Database: an

More information

Scheme G. Sample Test Paper-I

Scheme G. Sample Test Paper-I Scheme G Sample Test Paper-I Course Name : Computer Engineering Group Course Code : CO/CM/IF/CD/CW Marks : 25 Hours: 1 Hrs. Q.1 Attempt Any THREE. 09 Marks a) List any six applications of DBMS. b) Define

More information

Lesson 8: Introduction to Databases E-R Data Modeling

Lesson 8: Introduction to Databases E-R Data Modeling Lesson 8: Introduction to Databases E-R Data Modeling Contents Introduction to Databases Abstraction, Schemas, and Views Data Models Database Management System (DBMS) Components Entity Relationship Data

More information

IT2305 Database Systems I (Compulsory)

IT2305 Database Systems I (Compulsory) Database Systems I (Compulsory) INTRODUCTION This is one of the 4 modules designed for Semester 2 of Bachelor of Information Technology Degree program. CREDITS: 04 LEARNING OUTCOMES On completion of this

More information

Foundations of Information Management

Foundations of Information Management Foundations of Information Management - WS 2012/13 - Juniorprofessor Alexander Markowetz Bonn Aachen International Center for Information Technology (B-IT) Data & Databases Data: Simple information Database:

More information

Databases and BigData

Databases and BigData Eduardo Cunha de Almeida eduardo.almeida@uni.lu Outline of the course Introduction Database Systems (E. Almeida) Distributed Hash Tables and P2P (C. Cassagnes) NewSQL (D. Kim and J. Meira) NoSQL (D. Kim)

More information

Foundations of Information Management

Foundations of Information Management Foundations of Information Management - WS 2009/10 Juniorprofessor Alexander Markowetz Bonn Aachen International Center for Information Technology (B-IT) Alexander Markowetz Born 1976 in Brussels, Belgium

More information

Chapter 2: Entity-Relationship Model. Entity Sets. " Example: specific person, company, event, plant

Chapter 2: Entity-Relationship Model. Entity Sets.  Example: specific person, company, event, plant Chapter 2: Entity-Relationship Model! Entity Sets! Relationship Sets! Design Issues! Mapping Constraints! Keys! E-R Diagram! Extended E-R Features! Design of an E-R Database Schema! Reduction of an E-R

More information

IT2304: Database Systems 1 (DBS 1)

IT2304: Database Systems 1 (DBS 1) : Database Systems 1 (DBS 1) (Compulsory) 1. OUTLINE OF SYLLABUS Topic Minimum number of hours Introduction to DBMS 07 Relational Data Model 03 Data manipulation using Relational Algebra 06 Data manipulation

More information

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model The entity-relationship (E-R) model is a a data model in which information stored

More information

Chapter 2: Entity-Relationship Model

Chapter 2: Entity-Relationship Model Chapter 2: Entity-Relationship Model Entity Sets Relationship Sets Design Issues Mapping Constraints Keys E R Diagram Extended E-R Features Design of an E-R Database Schema Reduction of an E-R Schema to

More information

Mining Association Rules: A Database Perspective

Mining Association Rules: A Database Perspective IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.12, December 2008 69 Mining Association Rules: A Database Perspective Dr. Abdallah Alashqur Faculty of Information Technology

More information

DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM

DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM M. Mayilvaganan 1, S. Aparna 2 1 Associate

More information

ER modelling, Weak Entities, Class Hierarchies, Aggregation

ER modelling, Weak Entities, Class Hierarchies, Aggregation CS344 Database Management Systems ER modelling, Weak Entities, Class Hierarchies, Aggregation Aug 2 nd - Lecture Notes (Summary) Submitted by - N. Vishnu Teja Saurabh Saxena 09010125 09010145 (Most the

More information

Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization. Introduction to Normalization. Normal Forms.

Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization. Introduction to Normalization. Normal Forms. Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Robert C. Nickerson ISYS

More information

The Entity-Relationship Model

The Entity-Relationship Model The Entity-Relationship Model 221 After completing this chapter, you should be able to explain the three phases of database design, Why are multiple phases useful? evaluate the significance of the Entity-Relationship

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

THE ENTITY- RELATIONSHIP (ER) MODEL CHAPTER 7 (6/E) CHAPTER 3 (5/E)

THE ENTITY- RELATIONSHIP (ER) MODEL CHAPTER 7 (6/E) CHAPTER 3 (5/E) THE ENTITY- RELATIONSHIP (ER) MODEL CHAPTER 7 (6/E) CHAPTER 3 (5/E) 2 LECTURE OUTLINE Using High-Level, Conceptual Data Models for Database Design Entity-Relationship (ER) model Popular high-level conceptual

More information

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations 1 Topics for this week: 1. Good Design 2. Functional Dependencies 3. Normalization Readings for this week: 1. E&N, Ch. 10.1-10.6; 12.2 2. Quickstart, Ch. 3 3. Complete the tutorial at http://sqlcourse2.com/

More information

LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen 2007-05-24

LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen 2007-05-24 LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen 2007-05-24 1. A database schema is a. the state of the db b. a description of the db using a

More information

Fundamentals of Database System

Fundamentals of Database System Fundamentals of Database System Chapter 4 Normalization Fundamentals of Database Systems (Chapter 4) Page 1 Introduction To Normalization In general, the goal of a relational database design is to generate

More information

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs.

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs. Phases of database design Application requirements Conceptual design Database Management Systems Conceptual schema Logical design ER or UML Physical Design Relational tables Logical schema Physical design

More information

Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

More information

Foundations of Business Intelligence: Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management Foundations of Business Intelligence: Databases and Information Management Content Problems of managing data resources in a traditional file environment Capabilities and value of a database management

More information

Foundations of Business Intelligence: Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management Chapter 5 Foundations of Business Intelligence: Databases and Information Management 5.1 Copyright 2011 Pearson Education, Inc. Student Learning Objectives How does a relational database organize data,

More information

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key

More information

Relational Database Basics Review

Relational Database Basics Review Relational Database Basics Review IT 4153 Advanced Database J.G. Zheng Spring 2012 Overview Database approach Database system Relational model Database development 2 File Processing Approaches Based on

More information

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are

More information

Foundations of Business Intelligence: Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of

More information

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

Course Syllabus For Operations Management. Management Information Systems

Course Syllabus For Operations Management. Management Information Systems For Operations Management and Management Information Systems Department School Year First Year First Year First Year Second year Second year Second year Third year Third year Third year Third year Third

More information

Unit 2.1. Data Analysis 1 - V2.0 1. Data Analysis 1. Dr Gordon Russell, Copyright @ Napier University

Unit 2.1. Data Analysis 1 - V2.0 1. Data Analysis 1. Dr Gordon Russell, Copyright @ Napier University Data Analysis 1 Unit 2.1 Data Analysis 1 - V2.0 1 Entity Relationship Modelling Overview Database Analysis Life Cycle Components of an Entity Relationship Diagram What is a relationship? Entities, attributes,

More information

TIM 50 - Business Information Systems

TIM 50 - Business Information Systems TIM 50 - Business Information Systems Lecture 15 UC Santa Cruz March 1, 2015 The Database Approach to Data Management Database: Collection of related files containing records on people, places, or things.

More information

DATABASE MANAGEMENT SYSTEMS. Question Bank:

DATABASE MANAGEMENT SYSTEMS. Question Bank: DATABASE MANAGEMENT SYSTEMS Question Bank: UNIT 1 1. Define Database? 2. What is a DBMS? 3. What is the need for database systems? 4. Define tupule? 5. What are the responsibilities of DBA? 6. Define schema?

More information
