DEPARTMENT OF COMPUTER APPLICATIONS QUESTION BANK MC7403 DATA WAREHOUSING AND DATA MINING

Similar documents
Data Mining: Concepts and Techniques. Jiawei Han. Micheline Kamber. Simon Fräser University К MORGAN KAUFMANN PUBLISHERS. AN IMPRINT OF Elsevier

1. What are the uses of statistics in data mining? Statistics is used to Estimate the complexity of a data mining problem. Suggest which data mining

CS1011: DATA WAREHOUSING AND MINING TWO MARKS QUESTIONS AND ANSWERS

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Subject Description Form

Introduction to Data Mining

Chapter 20: Data Analysis

Data Mining Part 5. Prediction

An Overview of Knowledge Discovery Database and Data mining Techniques

Information Management course

The basic data mining algorithms introduced may be enhanced in a number of ways.

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

Data Warehousing and Data Mining. A.A Datawarehousing & Datamining 1

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

(b) How data mining is different from knowledge discovery in databases (KDD)? Explain.

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April ISSN

Data Mining + Business Intelligence. Integration, Design and Implementation

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

Introduction to Data Mining

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

Chapter 7. Cluster Analysis

DATA WAREHOUSING AND OLAP TECHNOLOGY

Analytics on Big Data

DATA WAREHOUSE E KNOWLEDGE DISCOVERY

Business Intelligence. Data Mining and Optimization for Decision Making

Practical Applications of DATA MINING. Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING

ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)

Data Preprocessing. Week 2

2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000

Distance Learning and Examining Systems

Fluency With Information Technology CSE100/IMT100

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Principles of Data Mining by Hand&Mannila&Smyth

from Larson Text By Susan Miertschin

Prediction of Heart Disease Using Naïve Bayes Algorithm

DATA MINING CONCEPTS AND TECHNIQUES. Marek Maurizio E-commerce, winter 2011

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Data Warehousing and Data Mining in Business Applications

Association Rule Mining: A Survey

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

Contents. Dedication List of Figures List of Tables. Acknowledgments

Week 13: Data Warehousing. Warehousing

A Data Mining Tutorial

Specific Usage of Visual Data Analysis Techniques

3/17/2009. Knowledge Management BIKM eclassifier Integrated BIKM Tools

Clustering. Data Mining. Abraham Otero. Data Mining. Agenda

Data Mining: Overview. What is Data Mining?

Building Data Cubes and Mining Them. Jelena Jovanovic

Application of Data Warehouse and Data Mining. in Construction Management

OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project

Classification algorithm in Data mining: An Overview

Data Warehousing and Data Mining

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r

Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors

Data Mining Analytics for Business Intelligence and Decision Support

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

CS 6220: Data Mining Techniques Course Project Description

Data Mining for Knowledge Management. Classification

Knowledge Discovery in Data with FIT-Miner

A Study of Web Log Analysis Using Clustering Techniques

Unsupervised Data Mining (Clustering)

Data W a Ware r house house and and OLAP Week 5 1

Social Media Mining. Data Mining Essentials

The Data Mining Process

Introduction to Data Mining Techniques

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

CLUSTERING AND PREDICTIVE MODELING: AN ENSEMBLE APPROACH

Data Warehousing and Data Mining

Integrated Data Mining and Knowledge Discovery Techniques in ERP

Finding Frequent Patterns Based On Quantitative Binary Attributes Using FP-Growth Algorithm

M Designing and Implementing OLAP Solutions Using Microsoft SQL Server Day Course

OLAP Theory-English version

DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS

Customer Classification And Prediction Based On Data Mining Technique

META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING

Oracle Data Mining. Concepts 11g Release 2 (11.2) E

Azure Machine Learning, SQL Data Mining and R

Data Warehousing Concepts

An Introduction to Data Mining

Oracle Data Mining. Concepts 11g Release 2 (11.2) E

Knowledge Discovery and Data Mining

Université de Montpellier 2 Hugo Alatrista-Salas : hugo.alatrista-salas@teledetection.fr

How To Identify A Churner

Cluster Analysis. Isabel M. Rodrigues. Lisboa, Instituto Superior Técnico

Detection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup

DATA MINING AND REPORTING IN HEALTHCARE

Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca

UNSUPERVISED MACHINE LEARNING TECHNIQUES IN GENOMICS

Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days

Data Exploration and Preprocessing. Data Mining and Text Mining (UIC Politecnico di Milano)

Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin

New Approach of Computing Data Cubes in Data Warehousing

Data Mining Algorithms Part 1. Dejan Sarka

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Data Warehousing and OLAP

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Transcription:

UNIT-I: DATA WAREHOUSE PART A 1. Define: Data Warehouse 2. Write down the applications of data warehousing. 3. Define: data mart 4. What are the advantages of data warehousing. 5. What is the difference between operational database and data warehouse? 6. What are the four different views of data warehouse design? 7. What is the difference between fact table and dimension table? 8. What is the difference between base cuboid and apex cuboid? 9. Define: star schema. 10. Define: Snow-flake schema. 11. Define: Fact Constellation. 12. What is the use of DMQL? 13. What are the three categories of measures of data cube? 14. What is meant by OLAP? 15. Write down the OLAP operations. 16. Difference between OLAP and OLTP. 17. What are the three data warehouse models? 18. What are the different data warehouse back-end tools? 19. What are the different data OLAP server architectures? 20. What is the use of bit-map index in OLAP data? 1. Explain the architecture of the data warehouse. 2. Discuss the typical OLAP operations with examples. 3. Explain multi-dimensional data model. 4. Explain the various schemas can be used for the construction of data warehouse. 5. Discuss in detail about the data cube. 6. Distinguish the difference between OLTP and OLAP. 7. Discuss in detail about various indexing techniques used in OLAP data. 8. Explain about data warehousing tools and metadata repository. 9. Explain the three different schema construction with necessary diagrams. 10. Discuss in detail about (i) design of data warehouse analysis framework (ii) data warehouse design process.

UNIT-II: DATA MINING & DATA PREPROCESSING PART A 1. State why the data preprocessing an important issue for data warehousing and data mining. 2. What is the need for Discretization in data mining? 3. What are the various forms of data preprocessing? 4. What is concept Hierarchy? Give an example. 5. What are the major tasks in data preprocessing? 6. What are the different multi-dimensional measures of data quality? 7. What are the different measures in measuring central tendency? 8. What are the different measures in measuring dispersion of data? 9. What is KDD? 10. Define: Noisy data. 11. What are the different methods to handle noisy data? 12. Explain the two different binning methods. 13. Define: Data Integration. 14. What are the different data transformation techniques available for preprocessing? 15. Define: Data Mining. 16. What do data mining functionalities include? 17. Give the Different data formats for data mining. 18. Explain the purpose of data reduction. 19. What do you mean by segmentation in Discretization of Numerical value? 20. Describe the inter attribute and intra attribute Relationship. 1. Explain the steps involved in KDD process with necessary diagram. 2. With neat sketch, explain the components of data mining system. 3. Discuss about different types of data and data mining functionalities 4. Write different approaches to data transformation. Explain various methods of data cleaning with examples 5. Explain about data reduction strategies. 6. Explain the primitives of data mining task. 7. Explain about data discretization. 8. How data mining system can be integrated with a data warehouse? Discuss with an example. 9. Explain about data cube aggregation. 10. Explain about concept hierarchy.

UNIT-III: ASSOCIATION RULE MINING 1. Define: Frequent Pattern. 2. Write the applications of frequent pattern analysis. 3. Write any four reasons for frequent pattern mining. 4. What is the difference between support and confidence? 5. What is the difference between closed patterns and max-patterns? 6. Define: Apriori pruning principle. 7. What are the different techniques that can be applied to improve Apriori algorithm? 8. What are the reasons for bottleneck of frequent-pattern mining? 9. What are the various mining association rules? 10. Give example for single-dimensional and multi-dimensional association rules. 11. What is mining quantitative associations? 12. Define: knowledge-type constraint. Give an example. 13. Define: data constraint. Give an example. 14. Define: rule constraint. Give an example. 15. Define: interestingness constraint. Give an example. 16. Distinguish the difference between constrained mining and constrained-based mining. 17. Define: Anti Monotonicity with an example. 18. Define: Monotonicity with an example. 19. Define: Succinctness. 20. What are scalable frequent pattern mining methods? 1. Explain in detail about frequent pattern analysis. 2. Write Apriori algorithm and apply the same to identify the frequent itemsets for the following: [without candidate generation]. Minimum support level is 30%. TID 10 20 30 40 Items A, C, D B, C, E A, B, C, E B, E 3. Identify the frequent itemsets for the following, without candidate generation: TID 10 20 30 40 Items A, C, D B, C, E A, B, C, E B, E 4. Explain about mining frequent patterns using FP-trees. 5. Explain about constraint-based association mining. 6. Describe the multi-dimensional association rule, giving a suitable example. 7. Discuss the approaches for mining multi-level association rules from the transactional databases. Give relevant example. 8. A database has four transactions. Let min sup=60% and min conf=80%. TID T100 T200 T300 T400 DATE 10/15/07 10/15/07 10/19/07 10/22/07 ITEMS_BOUGHT {K,A,B} {D,A,C,E,B} {C,A,B,E} {B,A,D} Find all frequent item sets using Apriori and FP growth, respectively. Compare the efficiency of the two mining process. 9. Explain the algorithm for finding frequent itemsets without candidate generation. 10. Explain about rule-based association rule mining.

UNIT-IV: CLASSIFICATION & PREDICTION 1. What is the difference between classification and prediction? 2. What are the two steps in classification process? 3. Distinguish the difference between supervised learning and unsupervised learning. 4. What are the various evaluation issues in classification methods? 5. What is the difference between lazy learning and eager learning? 6. List out the major strength of decision tree method. 7. Why the Naïve Bayes Classifier called so? 8. What is tree pruning in decision tree induction? 9. What is support vector machine? 10. State the advantages of the decision tree approach over other approaches for performing classification. 11. State: Bayesian Theorem. 12. What is the advantage of Bayesian Belief Network? 13. What is rule-based classification? Give an example. 14. Define: Classification by back propagation. 15. What are the strengths of neural network classifier? 16. What are the weaknesses of neural network classifier? 17. State few differences between SVM and neural network. 18. What is associative classification? 19. What is ensemble method? 20. List popular ensemble methods? 1. Decision tree induction is a popular classification method. Taking one typical decision tree induction algorithm briefly outline the method of decision tree classification 2. Explain the algorithm for constructing a decision tree from training samples 3. Develop an algorithm for classification using Bayesian classification. Illustrate the algorithm with a relevant example. 4. Discuss about Bayesian classification and explain how Bayesian approach applied on Datasets. 5. What is a neural network? Explain back propagation algorithm. 6. Explain the method of ID3 decision tree classification algorithm with an example 7. Explain in detail about SVM. 8. Explain about rule-based classification. 9. Explain about classification by back propagation. 10. Explain about neural network.

UNIT-V: CLUSTERING 1. What do you mean by cluster analysis? 2. List some application areas of clustering. 3. What are the requirements of clustering? 4. What are the types of data in Cluster analysis? 5. Give categories of major clustering approaches. 6. Define: Centroid of a cluster. 7. Define: Radius of a cluster. 8. Define: Diameter of a cluster. 9. Elucidate the difference between k-means and k-medoid. 10. Mention the advantages of hierarchical clustering. 11. Define: partitioning methods. 12. List the various density-based methods. 13. What are the advantages of hierarchical clustering? 14. List the requirements of clustering in data mining. 15. Define: Euclidean distance. 16. List the model-based clustering methods. 17. What is nominal variable? 18. What are vector objects? Mention is application areas. 19. What do you mean by binary variable? 20. What is an outlier? 1. Explain the different requirements of clustering. 2. Explain the different types of data in cluster analysis. 3. Describe k-means clustering with an example. 4. Describe in detail about k-medoids, CLARA and CLARANS methods. 5. Explain in detail the density-based methods of clustering. 6. Explain the model-based clustering methods. 7. Explain the grid-based clustering methods. 8. With relevant example discuss constraint-based cluster analysis. 9. Describe the working of PAM (Partitioning Around Medoids) algorithm. 10. Describe clustering high-dimensional data.