Valliammai Engineering College

Similar documents
Data Mining: Concepts and Techniques. Jiawei Han. Micheline Kamber. Simon Fräser University К MORGAN KAUFMANN PUBLISHERS. AN IMPRINT OF Elsevier

1. What are the uses of statistics in data mining? Statistics is used to Estimate the complexity of a data mining problem. Suggest which data mining

Introduction to Data Mining

Information Management course

Introduction to Data Mining

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

DATA WAREHOUSING AND OLAP TECHNOLOGY

DATA WAREHOUSE E KNOWLEDGE DISCOVERY

14. Data Warehousing & Data Mining

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Fluency With Information Technology CSE100/IMT100

Week 3 lecture slides

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

IST722 Data Warehousing

Data Warehousing and Data Mining in Business Applications

(b) How data mining is different from knowledge discovery in databases (KDD)? Explain.

Hybrid OLAP, An Introduction

Data Warehousing Concepts

3/17/2009. Knowledge Management BIKM eclassifier Integrated BIKM Tools

Data Warehouse Design

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

Chapter 20: Data Analysis

New Approach of Computing Data Cubes in Data Warehousing

Main Memory & Near Main Memory OLAP Databases. Wo Shun Luk Professor of Computing Science Simon Fraser University

VALLIAMMAI ENGNIEERING COLLEGE SRM Nagar, Kattankulathur

Subject Description Form

Data Warehousing and Data Mining

CS1011: DATA WAREHOUSING AND MINING TWO MARKS QUESTIONS AND ANSWERS

Principles of Data Mining by Hand&Mannila&Smyth

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing

Part 22. Data Warehousing

European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project

OLAP Theory-English version

Data Warehouse: Introduction

Introduction. A. Bellaachia Page: 1

Customer Classification And Prediction Based On Data Mining Technique

Data Mining. Concepts, Models, Methods, and Algorithms. 2nd Edition

SQL Server Analysis Services Complete Practical & Real-time Training

2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc.

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Data W a Ware r house house and and OLAP II Week 6 1

An Overview of Knowledge Discovery Database and Data mining Techniques

Class 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21 st century So where are the robots?

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April ISSN

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

Data Mining and Database Systems: Where is the Intersection?

CHAPTER-24 Mining Spatial Databases

DATA MINING CONCEPTS AND TECHNIQUES. Marek Maurizio E-commerce, winter 2011

Data Mining + Business Intelligence. Integration, Design and Implementation

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes

1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing

The Scientific Data Mining Process

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

When to consider OLAP?

Data warehousing with PostgreSQL

DATA MINING TECHNIQUES AND APPLICATIONS

SPATIAL DATA CLASSIFICATION AND DATA MINING

Data Warehousing and Data Mining

M Designing and Implementing OLAP Solutions Using Microsoft SQL Server Day Course

(Week 10) A04. Information System for CRM. Electronic Commerce Marketing

Data Mining - Introduction

Distance Learning and Examining Systems

Topics in basic DBMS course

Two-Phase Data Warehouse Optimized for Data Mining

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

DATA MINING AND REPORTING IN HEALTHCARE

Database Marketing, Business Intelligence and Knowledge Discovery

Practical Applications of DATA MINING. Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING

The basic data mining algorithms introduced may be enhanced in a number of ways.

Tutorials for Project on Building a Business Analytic Model Using Data Mining Tool and Data Warehouse and OLAP Cubes IST 734

DATA MINING AND WAREHOUSING CONCEPTS

Data Mining Solutions for the Business Environment

Overview. Background. Data Mining Analytics for Business Intelligence and Decision Support

Lecture Data Warehouse Systems

Data Mining. Vera Goebel. Department of Informatics, University of Oslo

CS2032 Data warehousing and Data Mining Unit II Page 1

OLAP and OLTP. AMIT KUMAR BINDAL Associate Professor M M U MULLANA

Data warehouse Architectures and processes

Business Intelligence. Data Mining and Optimization for Decision Making

Introduction. Introduction. Spatial Data Mining: Definition WHAT S THE DIFFERENCE?

Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1

Comparison of Data Mining Techniques used for Financial Data Analysis

LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES

Data Mining as Part of Knowledge Discovery in Databases (KDD)

Databases in Organizations

Analytics on Big Data

Chapter 3 Data Warehouse - technological growth

On-Line Application Processing. Warehousing Data Cubes Data Mining

Concept and Applications of Data Mining. Week 1

Application of Data Warehouse and Data Mining. in Construction Management

The University of Jordan

Dimensional Modeling for Data Warehouse

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence

Classification and Prediction

Week 13: Data Warehousing. Warehousing

Transcription:

Valliammai Engineering College DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK Sub Code :CS2032 Data Warehousing and Data Mining Branch / Year : CSE III sem / II Year Staff in charge: Ms.C.Pabitha UNIT-I DATA WAREHOUSING 1. Define the term Data Warehouse. 2. Write down the applications of data warehousing. 3. When is data mart appropriate? 4. List out the functionality of metadata. 5. What are nine decision in the design of a Data warehousing? 6. List out the two different types of reporting tools. 7. Why data mining is used in all organizations. 8. What are the technical issues to be considered when designing and implementing a data warehouse environment? 9. List out some of the examples of access tools. 10. What are the advantages of data warehousing. 11. Give the difference between the Horizontal and Vertical Parallelism. 12. Draw a neat diagram for the Distributed memory shared disk architecture. 13. Define star schema. 14. What are the reasons to achieve very good performance by SYBASE IQ technology? 15. What are the steps to be followed to store the external source into the data warehouse? 16. Define Legacy data. 17. Draw the standard framework for metadata interchange. 18. List out the five main groups of access tools. 19. Define Data Visualization. 20. What are the various forms of data preprocessing? 1. Enumerate the building blocks of data warehouse. Explain the importance of metadata in a data warehouse environment. 2. Explain various methods of data cleaning in detail

3. Diagrammatically illustrate and discuss the data warehousing architecture with briefly explain components of data warehouse 4. (i) Distinguish between Data warehousing and data mining. (ii)describe in detail about data extraction, cleanup 5. Write short notes on (i)transformation (ii)metadata 6. List and discuss the steps involved in mapping the data warehouse to a multiprocessor architecture. 7. Discuss in detail about Bitmapped Indexing 8. Explain in detail about different Vendor Solutions. 9. Explain Various group of Acess tool. 10.Explain indexing UNIT-II BUSINESS ANALYSIS 1. Difference between OLAP and OLTP. 2. Classify OLAP tools. 3. What is meant by OLAP? 4. Difference between OLAP & OLTP 5. Define Concept Hierarchy. 6. List out the five categories of decision support tools. 7. Define Cognos Impromptu 8. List out any 5 OLAP guidelines. 9. Distinguish between multidimensional and multi-relational OLAP. 10. Define ROLAP. 11. Draw a neat diagram for the web processing model. 12. Define MQE. 13. Draw a neat sketch for three-tired client/server architecture. 14. List out the applications that the organizations uses to build a query and reporting environment for the data warehouse. 15. Distinguish between window painter and data windows painter. 16. Define ADF, SGF and DEF. 17. What is the function of power play administrator? 18. What are the products of pilot software? 19. What are the FOCUS fusion components?

20. What is cognos power play? 1. Discuss the typical OLAP operations with an example. 2. List and discuss the basic features that are provided by reporting and query tools used for business analysis. 3. Describe in detail about Cognos Impromptu 4. Explain about OLAP in detail. 5. With relevant examples discuss multidimensional online analytical processing and multirelational online analytical processing. 6. Discuss about the OLAP tools and the Internet 7. Explain Multidimensional Data model. 8.Discuss how computations can be performed efficiently on data cubes. 9.Write notes on MOLAP,ROLAP 10.Describe in detail about IBI FOCUS, Pilot software UNIT-III DATA MINING 1. Define data. 2. State why the data preprocessing an important issue for data warehousing and data mining. 3. What is the need for discretization in data mining?. 4. What are the various forms of data preprocessing? 5. What is concept Hierarchy? Give an example. 6. What are the various forms of data preprocessing? 7. Mention the various tasks to be accomplished as part of data pre-processing. 8. Define Data Mining. 9. List out any four data mining tools. 10. What do data mining functionalities include? 11. Define patterns. 12. Define cluster Analysis 13. What is Outlier Analysis? 14. What makes the pattern interesting? 15. Difference between OLAP and Datamining 16. What do you mean by high performance data mining?

17. What are the Various data mining techniques? 18. What do you mean by predictive datamining? 19. What do you mean by descriptive data mining? 20. What are the steps involved in the data mining process? 1. (i) Explain the various primitives for specifying Data mining Task. (ii) Describe the various descriptive statistical measures for data mining. 2. Discuss about different types of data and functionalities. 3. (i)describe in detail about Interestingness of patterns. (ii)explain in detail about data mining task primitives. 4. (i)discuss about different Issues of data mining. 5. (ii)explain in detail about data preprocessing. How data mining system are classified? Discuss each classification with an example. 6. How data mining system can be integrated with a data warehouse? Discuss with an example. 7. Explain data mining applications for Biomedical and DNA data analysis 8. Explain data mining applications for financial data analysis. 9. Explain data mining applications for retail industry. 10. Explain data mining applications for Telecommunication industry. UNIT-IV ASSOCIATION RULE AND CLASSIFICATION Part A 1. What is meant by market Basket analysis? 2. What is the use of multilevel association rules? 3. What is meant by pruning in a decision tree induction? 4. Write the two measures of Association Rule. 5. With an example explain correlation analysis. 6. Define conditional pattern base. 7. List out the major strength of decision tree method. 8. In classification trees, what are the surrogate splits, and how are they used? 9. The Naïve Bayes classifier makes what assumptions that motivate its name? 10. What is the frequent item set property? 11. List out the major strength of the decision tree Induction.

12. Write the two measures of association rule. 13. How are association rules mined from large databases? 14. What is tree pruning in decision tree induction? 15. What is the use of multi level association rules? 16. What are the Apriori properties used in the Apriori algorithms? 17. How is predication different from classification? 18. What is a support vector machine? 19. What are the means to improve the performance of association rule mining algorithm? 20. State the advantages of the decision tree approach over other approaches for performing classification. 1. Decision tree induction is a popular classification method. Taking one typical decision tree induction algorithm, briefly outline the method of decision tree classification. 2. Consider the following training dataset and the original decision tree induction algorithm (ID3). Risk is the class label attribute. The Height values have been already discredited into disjoint ranges. Calculate the information gain if Gender is chosen as the test attribute. Calculate the information gain if Height is chosen as the test attribute. Draw the final decision tree (without any pruning) for the training dataset. Generate all the IF-THEN rules from the decision tree. Gender Height Risk F (1.5, 1.6) Low M (1.9, 2.0) High F (1.8, 1.9) Medium F (1.8, 1.9) Medium F (1.6, 1.7) Low M (1.8, 1.9) Medium F (1.5, 1.6) Low M (1.6, 1.7) Low M (2.0, 8) High M (2.0, 8) High F (1.7, 1.8) Medium M (1.9, 2.0) Medium F (1.8, 1.9) Medium F (1.7, 1.8) Medium F (1.7, 1.8) Medium (a) Given the following transactional database 1 C, B, H 2 B, F, S 3 A, F, G 4 C, B, H 5 B, F, G 6 B, E, O

(i) We want to mine all the frequent itemsets in the data using the Apriori algorithm. Assume the minimum support level is 30%. (You need to give the set of frequent item sets in L1, L2, candidate item sets in C1, C2, ) (ii) Find all the association rules that involve only B, C.H (in either left or right hand side of the rule). The minimum confidence is 70%. 3. Describe the multi-dimensional association rule, giving a suitable example. 4. (a)explain the algorithm for constructing a decision tree from training samples (b)explain Bayes theorem. 6. Develop an algorithm for classification using Bayesian classification. Illustrate the algorithm with a relevant example. 7. Discuss the approaches for mining multi level association rules from the transactional databases. Give relevant example. 8. Write and explain the algorithm for mining frequent item sets without candidate generation. Give relevant example. 9. How is attribute oriented induction implemented? Explain in detail. 10. Discuss in detail about Bayesian classification UNIT-V CLUSTERING AND APPLICATION AND TRENDS IN DATA 1. What are the requirements of clustering? 2. What are the applications of spatial data bases? 3. What is text mining? 4. Distinguish between classification and clustering. 5. Define a Spatial database. 6. List out any two various commercial data mining tools. 7. What is the objective function of K-means algorithm? 8. Mention the advantages of Hierarchical clustering. 9. Distinguish between classification and clustering. 10. List the requirements of clustering in data mining. 11. What is web usage mining? 12. What are the requirements of clustering? 13. What are the applications of spatial databases? 14. What is text mining?

15. What is cluster analysis? 16. What are the two data structures in cluster analysis? 17. What is an outlier? Give example. 18. What is audio data mining? 19. List two application of data mining. 20. What is multimedia data mining. 1. BIRCH and CLARANS are two interesting clustering algorithms that perform effective clustering in large data sets. (i) Outline how BIRCH performs clustering in large data sets. (ii) Compare and outline the major differences of the two scalable clustering algorithms BIRCH and CLARANS. 2. Write a short note on web mining taxonomy. Explain the different activities of text mining. 3. Discuss and elaborate the current trends in data mining. 4. Discuss spatial data bases and Text databases 5. What is a multimedia database? Explain the methods of mining multimedia database? 6. (a) Explain the following clustering methods in detail. (i) BIRCH (ii) CURE 7. Discuss in detail about any four data mining applications. 8. Write short notes on (i) Partitioning methods (ii) Outlier analysis 9. Describe K means clustering with an example. 10. Describe in detail about Hierarchical methods.