

1. What are the uses of statistics in data mining?
Statistics is used to:
* Estimate the complexity of a data mining problem.
* Suggest which data mining techniques are most likely to be successful.
* Identify data fields that contain the most surface information.

2. What is the main goal of statistics?
The basic goal of statistics is to extend knowledge about a subset of a collection to the entire collection.

3. What factors should be considered when selecting a sample in statistics?
The sample should be:
* Large enough to be representative of the population.
* Small enough to be manageable.
* Accessible to the sampler.
* Free of bias.

4. Name some advanced database systems.
* Object-oriented databases
* Object-relational databases

5. Name some application-oriented databases.
* Spatial databases
* Time-series databases
* Text databases
* Multimedia databases

6. Define relational databases.
A relational database is a collection of tables, each of which is assigned a unique name. Each table consists of a set of attributes (columns or fields) and usually stores a large set of tuples (rows or records). Each tuple in a relational table represents an object identified by a unique key and described by a set of attribute values.

7. Define transactional databases.
A transactional database consists of a file in which each record represents a transaction. A transaction typically includes a unique transaction identity number (trans_id) and a list of the items making up the transaction.

8. Define spatial databases.
Spatial databases contain spatial-related information. Such databases include geographic (map) databases, VLSI chip design databases, and medical and satellite image databases. Spatial data may be represented in raster format, consisting of n-dimensional bit maps or pixel maps.

9. What is a temporal database?
A temporal database stores time-related data. It usually stores relational data that include time-related attributes. These attributes may involve several timestamps, each having different semantics.

10. What is a time-series database?
A time-series database stores sequences of values that change with time, such as data collected regarding the stock exchange.

11. What is a legacy database?
A legacy database is a group of heterogeneous databases that combines different kinds of data systems, such as relational or object-oriented databases, hierarchical databases, network databases, spreadsheets, multimedia databases, or file systems.

12. What are the steps in the data mining process?
* Data cleaning
* Data integration
* Data selection
* Data transformation
* Data mining
* Pattern evaluation
* Knowledge representation

13. Define data cleaning.
Data cleaning means removing inconsistent data or noise and collecting the necessary information.

14. Define data mining.
Data mining is the process of extracting or mining knowledge from huge amounts of data.

15. Define pattern evaluation.
Pattern evaluation is used to identify the truly interesting patterns representing knowledge, based on some interestingness measures.

16. Define knowledge representation.
Knowledge representation techniques are used to present the mined knowledge to the user.

17. Define class/concept description.
Data can be associated with classes or concepts. It can be useful to describe individual classes and concepts in summarized, concise, and yet precise terms. Such descriptions of a class or a concept are called class/concept descriptions.

18. What is data characterization?
Data characterization is a summarization of the general characteristics or features of a target class of data. The data corresponding to the user-specified class are typically collected by a database query.

19. What is data discrimination?
Data discrimination is a comparison of the general features of target class data objects with the general features of objects from one or a set of contrasting classes.

20. What is association analysis?
Association analysis is the discovery of association rules showing attribute-value conditions that occur frequently together in a given set of data.

21. Define association rules.
Association rules are of the form $X \Rightarrow Y$, that is, $A_1 \wedge \dots \wedge A_m \rightarrow B_1 \wedge \dots \wedge B_n$, where $A_i$ (for $i \in \{1, \dots, m\}$) and $B_j$ (for $j \in \{1, \dots, n\}$) are attribute-value pairs. The association rule $X \Rightarrow Y$ is interpreted as "database tuples that satisfy the conditions in X are also likely to satisfy the conditions in Y."

22. List the major components of a typical data mining system.
The major components in the typical data mining system architecture are:
* Database, data warehouse, World Wide Web, or other information repository
* Database or data warehouse server
* Knowledge base
* Data mining engine
* Pattern evaluation module
* User interface

23. How does a data warehouse differ from a database? How are they similar?
Difference: A database system, or DBMS, consists of a collection of interrelated data, known as a database, and a set of software programs to manage and access the data. A data warehouse is a repository of information collected from multiple sources and stored under a unified schema.
Similarity: Queries can be applied to both a database and a data warehouse. A data warehouse is modeled by a multidimensional database structure.

24. What is concept description of hierarchies?
Concept description generates descriptions for the characterization and comparison of the data. It is sometimes called class description, when the concept to be described refers to a class of objects.

25. What is constraint-based association mining?
The specification of constraints or expectations that confine the search space of the mining process is called constraint-based association mining. The constraints can be:
* Knowledge-based constraints
* Data constraints
* Dimension/level constraints
* Interestingness constraints
* Rule constraints

26. What is linear regression?
Linear regression involves finding the best line to fit two attributes, so that one attribute can be used to predict the other. Example: a random variable y (the response variable) can be modeled as a linear function of another random variable x (the predictor variable) with the equation $y = wx + b$.

27. What are the two data structures in cluster analysis?
The two data structures in cluster analysis are:
* Data matrix (object-by-variable structure)
* Dissimilarity matrix (object-by-object structure)
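The line $y = wx + b$ from question 26 can be illustrated with a short sketch. The notes only say "the best line"; the code below assumes the usual least-squares criterion, and the function name `fit_line` and the sample data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (w, b) for the least-squares line y = w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx            # slope
    b = mean_y - w * mean_x  # intercept
    return w, b


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(w, b)
```

Once `w` and `b` are known, the response for a new predictor value `x` is estimated as `w * x + b`.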

28. How are concept hierarchies useful in OLAP?
In the multidimensional model, data are organized into multiple dimensions, and each dimension contains multiple levels of abstraction defined by concept hierarchies. This organization gives users the flexibility to view data from different perspectives. OLAP provides a user-friendly environment for interactive data analysis.

29. What do you mean by a virtual warehouse?
A virtual warehouse is a set of views over operational databases. For efficient query processing, only some of the possible summary views may be materialized. A virtual warehouse is easy to build but requires excess capacity on operational database servers.

30. List five data mining tools.
* IBM's Intelligent Miner
* DataMind Corporation's DataMind
* Pilot's Discovery Server
* Tools from Business Objects and SAS Institute
* End-user tools

31. What is KDD?
Knowledge discovery in databases (KDD) is a process that consists of an iterative sequence of the following steps:
* Data cleaning
* Data integration
* Data selection
* Data transformation
* Data mining
* Pattern evaluation
* Knowledge presentation

32. List the classifications of data mining systems.
* Classification according to the kinds of databases mined
* Classification according to the kinds of knowledge mined
* Classification according to the techniques utilized
* Classification according to the applications adapted

33. What is concept description?
Concept description is a form of data generalization. A concept typically refers to a collection of data, such as frequent buyers or graduate students. Concept description generates descriptions for the characterization and comparison of the data.

34. What is association rule mining?
Association rule mining consists of first finding frequent itemsets (sets of items, such as A and B, that satisfy a minimum support threshold, or percentage of the task-relevant tuples), from which strong association rules of the form $A \Rightarrow B$ are generated. The rules also satisfy a minimum confidence threshold. Associations can be further analyzed to uncover correlation rules.

35. What is tree pruning?
Tree pruning is used to remove anomalies in the training data due to noise or outliers. It addresses the problem of overfitting the data. There are two approaches to tree pruning:
* Pre-pruning: the tree is pruned by halting its construction early.
* Post-pruning: subtrees are removed from a fully grown tree.

36. What is cluster analysis?
The process of grouping a set of physical or abstract objects into classes of similar objects is called clustering. A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters.

37. What is a concept hierarchy?
A concept hierarchy defines a sequence of mappings from a set of low-level concepts to higher-level, more general concepts. Concept hierarchies may be implicit within the database schema. A concept hierarchy that is a total or partial order among attributes in a database schema is called a schema hierarchy.

38. What are aggregation and metadata?
In aggregation, summary or aggregation operations are applied to the data. For example, daily sales data may be aggregated to compute monthly and annual totals. This is used in constructing a data cube for analysis of the data at multiple granularities.
Metadata are data about data; they define warehouse objects. Metadata are created for the data names and definitions of the given warehouse.
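The daily-to-monthly-to-annual aggregation described in question 38 can be sketched as follows. This is only an illustration: the sales records, the ISO date strings, and the helper `aggregate` are assumptions, not part of the original notes.

```python
from collections import defaultdict

# Hypothetical daily sales records: (ISO date, amount).
daily_sales = [
    ("2023-01-05", 100.0),
    ("2023-01-20", 50.0),
    ("2023-02-03", 75.0),
    ("2024-01-10", 20.0),
]


def aggregate(records, key_len):
    """Sum amounts grouped by the first key_len characters of the date.

    key_len=7 groups by month ("YYYY-MM"); key_len=4 groups by year ("YYYY").
    """
    totals = defaultdict(float)
    for date, amount in records:
        totals[date[:key_len]] += amount
    return dict(totals)


monthly = aggregate(daily_sales, 7)
annual = aggregate(daily_sales, 4)
print(monthly)
print(annual)
```

Each level of granularity (day, month, year) corresponds to one level of the time dimension in a data cube.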

39. What are the star schema and the snowflake schema?
The star schema is the most common modeling paradigm, in which the data warehouse contains:
* A large central table (fact table) containing the bulk of the data with no redundancy.
* A set of smaller attendant tables, one for each dimension.
The snowflake schema is a variant of the star schema model in which some dimension tables are normalized, thereby further splitting the data into additional tables.

40. Write short notes on spatial clustering.
Spatial data clustering identifies clusters, or densely populated regions, according to some distance measurement in a large, multidimensional data set.

41. State the types of linear model and state their use.
Generalized linear models represent the theoretical foundation on which linear regression can be applied to the modeling of categorical response variables. The types of generalized linear model are:
* Logistic regression
* Poisson regression

42. What are the goals of time series analysis?
* Finding patterns in the data
* Predicting future values

43. What is smoothing?
Smoothing is an approach used to remove nonsystematic behavior found in a time series. It can be used to detect trends in a time series.

44. What is lag?
The time difference between related items is referred to as lag.

45. List the preprocessing steps that may be applied to the data for classification and prediction.
* Data cleaning
* Relevance analysis
* Data transformation

46. Define data classification.
Data classification is a two-step process. In the first step, a model is built describing a predetermined set of data classes or concepts. The model is constructed by analyzing database tuples described by attributes. In the second step, the model is used for classification.

47. What are Bayesian classifiers?
Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given sample belongs to a particular class.

48. What is a decision tree?
A decision tree is a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class or class distribution. A decision tree is a predictive model. Each branch of the tree is a classification question, and the leaves of the tree are partitions of the data set with their classification.

49. Where are decision trees mainly used?
* Exploration of data sets and business problems
* Data preprocessing for other predictive analysis
* Exploratory analysis by statisticians

50. How will you solve a classification problem using a decision tree?
* Decision tree induction: construct a decision tree using training data.
* For each $t_i \in D$, apply the decision tree to determine its class, where $t_i$ is a tuple and $D$ is the database.

51. How are association rules mined from large databases?
Association rule mining is a two-step process:
* Find all frequent itemsets.
* Generate strong association rules from the frequent itemsets.

52. What is the classification of association rules based on various criteria?
   1. Based on the types of values handled in the rule
      a. Boolean association rule
      b. Quantitative association rule
   2. Based on the dimensions of data involved in the rule
      a. Single-dimensional association rule
      b. Multidimensional association rule
   3. Based on the levels of abstraction involved in the rule

      a. Single-level association rule
      b. Multilevel association rule
   4. Based on various extensions to association mining
      a. Max-patterns
      b. Frequent closed itemsets

53. What is the Apriori algorithm?
Apriori is an influential algorithm for mining frequent itemsets for Boolean association rules. It uses prior knowledge of frequent itemset properties and employs an iterative approach known as level-wise search, in which k-itemsets are used to explore (k+1)-itemsets.

54. Define a data mart.
A data mart is a pragmatic collection of related facts, but it does not have to be exhaustive or exclusive. A data mart is both a kind of subject area and an application. A data mart is a collection of numeric facts.

55. What is the data warehouse performance issue?
The performance of a data warehouse is largely a function of the quantity and type of data stored within the database and the query and data-loading workload placed upon the system.

56. What is data inconsistency cleaning?
This can be summarized as the process of cleaning up the small inconsistencies that introduce themselves into the data. Examples include duplicate keys and unreferenced foreign keys.

57. What are the merits of a data warehouse?
* Ability to make effective decisions from the database
* Better analysis of data and decision support
* Discovery of trends and correlations that benefit the business
* Ability to handle huge amounts of data

58. What are the characteristics of a data warehouse?
* Separate
* Available
* Integrated
* Subject oriented
* Not dynamic
* Consistency

* Iterative development
* Aggregation performance

59. List some of the data warehouse tools.
* OLAP (Online Analytic Processing)
* ROLAP (Relational OLAP)
* End User Data Access Tool
* Ad Hoc Query Tool
* Data Transformation Services
* Replication

60. Explain OLAP.
OLAP is the general activity of querying and presenting text and number data from data warehouses, as well as a specifically dimensional style of querying and presenting that is exemplified by a number of "OLAP vendors". The OLAP vendors' technology is nonrelational and is almost always based on an explicit multidimensional cube of data. OLAP databases are also known as multidimensional databases.

61. Explain ROLAP.
ROLAP is a set of user interfaces and applications that give a relational database a dimensional flavor. ROLAP stands for Relational Online Analytic Processing.

62. Explain the End User Data Access Tool.
An End User Data Access Tool is a client of the data warehouse. In a relational data warehouse, such a client maintains a session with the presentation server, sending a stream of separate SQL requests to the server. Eventually the End User Data Access Tool is done with the SQL session and turns around to present a screen of data, a report, a graph, or some other higher form of analysis to the user. An End User Data Access Tool can be as simple as an Ad Hoc Query Tool or as complex as a sophisticated data mining or modeling application.

63. Explain the Ad Hoc Query Tool.
An Ad Hoc Query Tool is a specific kind of end user data access tool that invites users to form their own queries by directly manipulating relational tables and their joins. Ad Hoc Query Tools, as powerful as they are, can only be effectively used and understood by about 10% of all the potential end users of a data warehouse.

64. Name some data mining applications.
* Data mining for biomedical and DNA data analysis
* Data mining for financial data analysis

* Data mining for the retail industry
* Data mining for the telecommunication industry

65. What are the contributions of data mining to DNA analysis?
* Semantic integration of heterogeneous, distributed genome databases
* Similarity search and comparison among DNA sequences
* Association analysis: identification of co-occurring gene sequences
* Path analysis: linking genes to different stages of disease development
* Visualization tools and genetic data analysis

66. Name some examples of data mining in the retail industry.
* Design and construction of data warehouses based on the benefits of data mining
* Multidimensional analysis of sales, customers, products, time, and region
* Analysis of the effectiveness of sales campaigns
* Customer retention: analysis of customer loyalty
* Purchase recommendation and cross-referencing of items

67. What is the difference between "supervised" and "unsupervised" learning schemes?
In supervised learning, the class label of each training sample is provided; the learning of the model is supervised in that it is told to which class each training sample belongs. Example: classification.
In unsupervised learning, the class label of each training sample is not known, and the number or set of classes to be learned may not be known in advance. Example: clustering.

68. Discuss the importance of the similarity metric in clustering. Why is it difficult to handle categorical data for clustering?
The process of grouping a set of physical or abstract objects into classes of similar objects is called clustering. The similarity metric is important because it determines which objects are grouped together, and it is also used for outlier detection. Main-memory-based clustering algorithms typically operate on only the following two data structures:
a) Data matrix
b) Dissimilarity matrix
Both structures assume that a distance or dissimilarity can be computed between objects. Such a measure is not directly defined for categorical values, so categorical data are difficult to handle.
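To make the difficulty in question 68 concrete, here is a minimal sketch of one common workaround that the notes themselves do not describe: the simple matching dissimilarity, $d(i, j) = (p - m) / p$, where $p$ is the number of categorical attributes and $m$ the number of attributes on which the two objects agree. The objects and function names are hypothetical.

```python
def simple_matching(a, b):
    """Fraction of attributes on which two categorical objects differ."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return (len(a) - matches) / len(a)


def dissimilarity_matrix(objects):
    """Object-by-object matrix of pairwise simple matching dissimilarities."""
    n = len(objects)
    return [
        [simple_matching(objects[i], objects[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical objects described by (color, size, shape).
objects = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]
matrix = dissimilarity_matrix(objects)
for row in matrix:
    print(row)
```

The result is a dissimilarity matrix in the object-by-object form named above, so an algorithm that accepts that structure can be applied to categorical data without inventing a numeric encoding.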

69. Mention at least three advantages of Bayesian networks for data analysis. Explain each one.
a) A Bayesian network is a graphical representation of uncertain knowledge that is easy to construct and interpret.
b) The representation has formal probabilistic semantics, making it suitable for statistical manipulation.
c) The representation is used for encoding uncertain expert knowledge in expert systems.

70. Why do we need to prune a decision tree? Why should we use a separate pruning data set instead of pruning the tree with the training database?
When a decision tree is built, many of the branches will reflect anomalies in the training data due to noise or outliers. Tree pruning methods are needed to address this problem of overfitting the data. A separate pruning set is used because the tree has already been fitted to the training data, so the training data cannot show which branches merely reflect noise.

71. Explain the various OLAP operations.
a) Roll-up: performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by dimension reduction.
b) Drill-down: the reverse of roll-up. It navigates from less detailed data to more detailed data.
c) Slice: performs a selection on one dimension of the given cube, resulting in a subcube.

72. Discuss the concepts of frequent itemset, support, and confidence.
A set of items is referred to as an itemset. An itemset that contains k items is called a k-itemset. An itemset that satisfies minimum support is referred to as a frequent itemset.
Support is the ratio of the number of transactions that include all items in the antecedent and consequent parts of the rule to the total number of transactions.
Confidence is the ratio of the number of transactions that include all items in the consequent as well as the antecedent to the number of transactions that include all items in the antecedent.

73. Why is data quality so important in a data warehouse environment?
Data quality is important in a data warehouse environment because the warehouse exists to facilitate decision-making. In order to support decision-making, the stored data should provide information from a historical perspective and in a summarized manner.
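The definitions of support and confidence in question 72 translate directly into code. The sketch below is illustrative only: the transactions, the item names, and the 0.5 minimum support threshold are made up.

```python
# Hypothetical transactional database: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


def is_frequent(itemset, transactions, min_support):
    """An itemset is frequent if it satisfies the minimum support."""
    return support(itemset, transactions) >= min_support


print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
print(is_frequent({"bread", "milk"}, transactions, 0.5))
```

Here the rule bread => milk is supported by 2 of 4 transactions, and it holds in 2 of the 3 transactions that contain bread.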

74. How can data visualization help in decision-making?
Data visualization helps the analyst gain intuition about the data being observed. Visualization applications frequently assist the analyst in selecting display formats, viewer perspectives, and data representation schemas that foster deep intuitive understanding, thus facilitating decision-making.

75. What do you mean by high performance data mining?
Data mining refers to extracting or mining knowledge. It involves an integration of techniques from multiple disciplines, such as database technology, statistics, machine learning, and neural networks. When it involves techniques from high performance computing, it is referred to as high performance data mining.

76. What are the merits of a data warehouse?
The merits of a data warehouse are the following:
* Ability to make effective decisions from the database.
* Discovery of trends and correlations that benefit the business.
* Better analysis of data and decision support.
* Better understanding of the business and the ability to handle huge amounts of data.
* The possibility of serving the customer better.
* Better understanding of business risks.
* Improvement of business processes.
* Ability to make tailor-made products and services.

77. What are the merits of a spatial data warehouse?
The merits of a spatial data warehouse are the following:
* Making dynamic geographic queries on data.
* Aggregating data to geographic areas.
* Analyzing data and its spatial reorganization.
* Visualization and presentation of data.

78. Describe the two common approaches to tree pruning.
In the pre-pruning approach, a tree is pruned by halting its construction early. The second approach, post-pruning, removes branches from a fully grown tree. A tree node is pruned by removing its branches.

79. What is clustering?
Clustering is the process of grouping data into classes or clusters so that objects within a cluster have high similarity to one another but are very dissimilar to objects in other clusters.

80. What are the requirements of clustering?
* Scalability
* Ability to deal with different types of attributes
* Ability to deal with noisy data
* Minimal requirements for domain knowledge to determine input parameters
* Constraint-based clustering
* Interpretability and usability

81. State the categories of clustering methods.
* Partitioning methods
* Hierarchical methods
* Density-based methods
* Grid-based methods
* Model-based methods

82. Differentiate between lazy learners and eager learners.
Nearest neighbor classifiers are lazy learners in that they store all of the training samples and do not build a classifier until a new (unlabeled) sample needs to be classified. Eager learning methods, such as decision tree induction and backpropagation, construct a generalized model before receiving new samples to classify.

83. What is network pruning?
Network pruning is the first step toward extracting rules from neural networks. It consists of removing weighted links that do not result in a decrease in the classification accuracy of the given network.

84. List the various criteria for classifying data mining systems.
* Kinds of databases mined
* Kinds of knowledge mined
* Kinds of techniques utilized
* Applications adapted
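The lazy learner described in question 82 can be sketched as a one-nearest-neighbor classifier. This is an illustrative assumption rather than a method taken from the notes: the class name `NearestNeighbor`, the Euclidean distance, and the training points are all hypothetical.

```python
import math


class NearestNeighbor:
    """Lazy learner: training only stores the samples."""

    def __init__(self):
        self.samples = []

    def fit(self, points, labels):
        # No model is built here; the labeled samples are simply kept.
        self.samples = list(zip(points, labels))

    def predict(self, point):
        # All the work happens at classification time: find the closest
        # stored sample by Euclidean distance and return its label.
        nearest = min(self.samples, key=lambda s: math.dist(s[0], point))
        return nearest[1]


clf = NearestNeighbor()
clf.fit([(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)], ["a", "b", "c"])
print(clf.predict((0.9, 1.1)))
```

An eager learner would instead do its expensive work inside `fit` and keep only the resulting model, discarding the need to scan the stored samples for each prediction.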

85. Name some data mining techniques.
* Statistics
* Machine learning
* Decision trees
* Hidden Markov models
* Artificial neural networks
* Genetic algorithms
* Meta learning

86. Explain the DBMiner tool in data mining.
* System architecture
* Input and output
* Data mining tasks supported by the system
* Support for task and method selection
* Support of the KDD process
* Main applications
* Current status

87. Define iceberg query.
An iceberg query computes an aggregate function over an attribute or set of attributes in order to find aggregate values above some specified threshold. Given a relation R with attributes a1, a2, ..., an and b, and an aggregate function agg_f, an iceberg query has the form:
   SELECT R.a1, R.a2, ..., R.an, agg_f(R.b)
   FROM relation R
   GROUP BY R.a1, R.a2, ..., R.an
   HAVING agg_f(R.b) >= threshold

88. Define DBMiner.
DBMiner is an online analytical mining system developed for interactive mining of multiple-level knowledge in large relational databases and data warehouses.

89. List the DBMiner tasks.
* OLAP analyzer
* Association
* Classification
* Clustering
* Prediction
* Time series analysis
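The iceberg query form given in question 87 can be tried out with Python's built-in `sqlite3` module. The sketch below is only an illustration: the `sales` table, its columns, the rows, and the threshold of 5 are made up, and `SUM` stands in for the generic aggregate `agg_f`.

```python
import sqlite3

# In-memory database with a hypothetical sales relation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, item TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("ann", "pen", 3),
        ("ann", "pen", 4),
        ("bob", "pen", 1),
        ("bob", "ink", 2),
        ("ann", "ink", 1),
    ],
)

threshold = 5
# Iceberg query: group by (customer, item) and keep only the groups whose
# aggregate value reaches the threshold.
rows = conn.execute(
    """
    SELECT customer, item, SUM(qty)
    FROM sales
    GROUP BY customer, item
    HAVING SUM(qty) >= ?
    ORDER BY customer, item
    """,
    (threshold,),
).fetchall()
print(rows)
conn.close()
```

Only the (ann, pen) group, with a total quantity of 7, rises above the threshold; the remaining groups stay "below the waterline", which is where the name comes from.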

90. Explain how data mining is used in health care analysis.
* Health care data mining and its aims
* Health care data mining techniques
* Segmenting patients into groups
* Identifying patients with recurring health problems
* Relation between diseases and symptoms
* Curbing treatment costs
* Predicting medical diagnoses
* Medical research
* Hospital administration
* Applications of data mining in health care

91. Explain data mining applications for financial data analysis.
* Loan payment prediction and customer credit policy analysis
* Classification and clustering of customers for targeted marketing
* Detection of money laundering and other financial crimes

92. Explain data mining applications for the telecommunication industry.
* Multidimensional analysis of telecommunication data
* Fraudulent pattern analysis and the identification of unusual patterns
* Multidimensional association and sequential pattern analysis
* Use of visualization tools in telecommunication data analysis

93. Define spatial data warehouse.
A spatial data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of both spatial and nonspatial data in support of spatial data mining and spatial-data-related decision-making processes.

94. What are the different types of dimensions in a spatial data cube?
* Nonspatial dimension
* Spatial-to-nonspatial dimension
* Spatial-to-spatial dimension

95. Define spatial association rule.
A spatial association rule is of the form $A \Rightarrow B\ [s\%, c\%]$, where A and B are sets of spatial or nonspatial predicates, s% is the support of the rule, and c% is the confidence of the rule.

96. Define horizontal parallelism.
Horizontal parallelism means that the database is partitioned across multiple disks, and parallel processing occurs within a specific task that is performed concurrently on different processors against different sets of data.

97. Define vertical parallelism.
Vertical parallelism occurs among different tasks: all component query operations are executed in parallel in a pipelined fashion.

98. What is the need for OLAP?
* To analyze data stored in a database
* To analyze different dimensions in a multidimensional database

99. Explain the various types of variables used in clustering.
* Interval-scaled variables
* Binary variables
  o Symmetric binary variables
  o Asymmetric binary variables
* Nominal variables
* Ordinal variables
* Ratio-scaled variables

100. Explain the hierarchical method of clustering.
* Agglomerative and divisive hierarchical clustering
* BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)
* CURE (Clustering Using REpresentatives)
* Chameleon
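The agglomerative approach listed under question 100 can be sketched in a few lines: start with each object in its own cluster and repeatedly merge the two closest clusters. The notes do not specify a linkage criterion, so single linkage (minimum pairwise Euclidean distance) is assumed here, and the points and function names are hypothetical.

```python
import math


def single_link(c1, c2):
    """Distance between clusters: the smallest pairwise point distance."""
    return min(math.dist(p, q) for p in c1 for q in c2)


def agglomerative(points, k):
    """Bottom-up clustering: merge the closest clusters until k remain."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > k:
        # Find the pair of clusters (i < j) with the smallest distance.
        i, j = min(
            (
                (a, b)
                for a in range(len(clusters))
                for b in range(a + 1, len(clusters))
            ),
            key=lambda ab: single_link(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(agglomerative(points, 3))
```

A divisive method works in the opposite direction, starting from one cluster that holds every object and splitting it step by step. This naive version recomputes all pairwise cluster distances on each merge, so it is only suitable for small inputs.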


More information

CHAPTER-24 Mining Spatial Databases

CHAPTER-24 Mining Spatial Databases CHAPTER-24 Mining Spatial Databases 24.1 Introduction 24.2 Spatial Data Cube Construction and Spatial OLAP 24.3 Spatial Association Analysis 24.4 Spatial Clustering Methods 24.5 Spatial Classification

More information

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1 Slide 29-1 Chapter 29 Overview of Data Warehousing and OLAP Chapter 29 Outline Purpose of Data Warehousing Introduction, Definitions, and Terminology Comparison with Traditional Databases Characteristics

More information
