Data Mining Jargon. Bob Muenchen The Statistical Consulting Center

Data mining is the automated search for useful patterns in data. It uses tools from many different disciplines, each of which has its own technical jargon. This document defines the jargon that is most widely used. A similar document, which translates neural networking jargon into statistical terms, can be found at ftp.sas.com/neural/jargon.
If you need assistance, call the Helpdesk, send email to stathelp@utk.edu, or stop by the SCC walk-in support area at 200 Stokely Management Center. All UT students, faculty, and staff researchers can get up to 10 hours of free assistance for their statistical computing needs each semester. See oit.utk.edu/scc for details. We also offer training each semester. See web.utk.edu/~training for details.
Analytics - the tools of data mining. The major categories of analytics are cluster analysis, decision trees, neural networks, statistical models and association analysis. Analytics that deal with the future are called Predictive Analytics. Since accurate information about the future is so valuable, some view predictive analytics as the core mission of data mining.
Artificial Intelligence - the field of science that studies how to make computers intelligent. It consists mainly of the fields of Machine Learning (neural networks and decision trees) and expert systems.
Artificial Neural Network (ANN) - see Neural Networks.
Association Analysis - a data mining tool that discovers combinations of options and how often they occur together. For example, 80% of people who buy paint also buy brushes. It is essentially a method of rule induction in which all variables are viewed as targets. When applied to products purchased, it is called market basket analysis.
Back Office - the part of the company that customers don't see, which is run using data stored in an Enterprise Resource Planning system and a Supply Chain Management system.
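
As a minimal sketch of the Association Analysis entry above, the following Python counts how often items are bought together and computes the share of paint buyers who also buy brushes. The baskets, item names and function names are made up for illustration; real market basket tools search all item combinations automatically.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets; the items and counts are invented for illustration.
baskets = [
    {"paint", "brushes"},
    {"paint", "brushes", "tape"},
    {"paint", "brushes"},
    {"paint", "brushes", "rollers"},
    {"paint", "tape"},
    {"brushes"},
]


def rule_confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


def pair_counts(baskets):
    """Count how often each unordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


confidence = rule_confidence(baskets, "paint", "brushes")
print(f"paint -> brushes: {confidence:.0%}")
print(pair_counts(baskets).most_common(3))
```

Every item is treated as a possible target here, which is what distinguishes association analysis from a model built around one target variable.
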
Black Box - a term used to describe a model which, although it may work well, is too complex for people to understand. It is usually expressible only as a long series of incomprehensible equations, as in neural network models.
Business Intelligence - making better decisions through the use of objective analysis. The four main BI tools are Report & Query, Online Analytical Processing (OLAP), Visualization and Data Mining.
Champion Model - the model that best solves the data mining problem.

Classify - developing a model to place records into known categories, e.g. defaulted on loan or not.
Class Discovery - a term used in biological data mining to refer to unsupervised training or cluster analysis.
Class Prediction - a term used in biological data mining to refer to supervised training with a categorical target variable.
Cluster Analysis - developing a model that discovers categories of similar records. Usually performed as a prelude to other analyses. Also called unsupervised training.
Concatenate - combining datasets or marts so that their columns are aligned and new rows are added. The columns are aligned either by using the same column headings or by manual column-by-column matching, for example when ID in one table is called SSN in another.
CRM - see Customer Relationship Management.
Curse of Dimensionality - refers to the fact that the more variables you study, the larger your dataset needs to be to have a chance at modeling the larger space. The relationship between variables and observations is often described as exponential: the number of observations needed grows much faster than the number of variables. For example, if 100 observations are barely sufficient to model 10 variables at once, modeling 10 times as many variables would require at least 100 times as many observations, and so on. Removing irrelevant variables and removing redundant variables are the two easiest ways to fight the curse.
Customer Relationship Management (CRM) - the process of studying and interacting with customers to maximize profits. Luckily, ensuring customer satisfaction is a key way to maximize profitability, but cutting service to unprofitable customers is also involved. This is such a popular use of analysis in business that companies such as SPSS that once said their business was statistical analysis now say it is CRM. CRM is one of the three main areas to which data mining is applied: supply chain management (SCM), enterprise resource planning (ERP) and customer relationship management (CRM).
Data Access - reading the data for analysis. This may include inputting the data from a flat file, translating a copy of some data from a database or warehouse so that the data mining software can analyze it, or defining a way to read the data directly using a method such as the Open Database Connectivity standard.
Data Conversion - performing a one-time translation of data from its original format (perhaps stored in a database) into the format used by a data mining package. Example conversion tools are DBMScopy, StatTransfer and Data Junction. See also Data Access and ETL.
Data Cube - a data structure of aggregated values summarized for a combination of pre-selected categorical variables, e.g. number of items sold and their total cost for each time period, region and product. This structure is required for the high-speed analysis of summaries that is done in Online Analytical Processing (OLAP). Also called a Multidimensional Database or MDDB.
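
The Data Cube entry above can be sketched in a few lines of Python. This is only an illustration under assumed data: the sales records, field order and the `build_cube` name are hypothetical, and a real multidimensional database stores and indexes these cells far more efficiently.

```python
from collections import defaultdict

# Hypothetical sales records: (period, region, product, items_sold, cost).
sales = [
    ("2024-Q1", "East", "paint", 10, 250.0),
    ("2024-Q1", "East", "paint", 5, 125.0),
    ("2024-Q1", "West", "brushes", 20, 80.0),
    ("2024-Q2", "East", "brushes", 8, 32.0),
]


def build_cube(records):
    """Aggregate items sold and total cost for each (period, region, product) cell."""
    cube = defaultdict(lambda: {"items": 0, "cost": 0.0})
    for period, region, product, items, cost in records:
        cell = cube[(period, region, product)]
        cell["items"] += items
        cell["cost"] += cost
    return dict(cube)


cube = build_cube(sales)
print(cube[("2024-Q1", "East", "paint")])
```

Because each cell already holds the summary, a lookup never has to rescan the individual sales, which is the source of the speed the entry mentions.
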

Data Management - all of the tasks required to manage data, such as correcting data entry errors, estimating values of missing data, and subsetting or combining sets of data.
Data Mart - a small data warehouse that is focused on a single area such as a research project or a single department such as sales or accounting. Ideally, all marts in an organization should be compatible, but they often differ in structure and file format.
Data Model - has several very different meanings. To a data miner, it can mean two different things. It can refer to the structure a database administrator chooses for storing the data in a database or data warehouse, i.e. how the tables relate to each other. It can also refer to the way a given database program requires data to be stored, for example in relational or hierarchical form. If a data analyst uses this term, it refers to the rules or formulas that describe relationships among the variables (see Modeling).
Data Quality (DQ) - addresses the issues of getting the right measures, ensuring that the measures are timely and accurate, that editing is done with controls to prevent errors, that manipulations such as formulas are accurate and documented, and that the data is accurately described. Also known as Information Quality or IQ.
Data Table - a collection of data measurements organized into a rectangular arrangement of columns and rows. Each column contains a single measure, such as blood pressure, for all sampling units. Columns are also called fields, variables, vectors or attributes. Each row contains the measures for a given sampling unit, such as all medical information for a person. Rows are also called observations, cases, records or instances.
Data Visualization - see Visualization.
Data Warehouse - a static copy of a database that has been denormalized, i.e. optimized for analysis. For example, storing a customer's address with each purchase he or she makes may waste computer space, but it makes it very easy to find the mean sales for any given geographic region without knowing the location of the address table. Doing analysis on a data warehouse also prevents analyses from interfering with ongoing data collection.
Database - a collection of data organized for efficient use in a continuously updated situation, such as frequent sales or reservations. Far and away the most common type of database is a Relational database, in which a collection of data tables is stored and the tables are related or linked by a common key. A key is a column or collection of columns that uniquely identifies a row, such as Social Security Number. A database is optimized for online transaction processing (e.g. selling products, entering patient information), not for analysis. The optimized state of the database is called normalized or third-normal form. Briefly, it removes redundant information such as the full customer address of every sale, storing it in a separate table.
Database Administrator (DBA) - the person responsible for organizing the data for an organization. Tasks include changing the structure of databases to optimize speed, and of data warehouses to optimize data mining efficiency. In any large organization, this is the person you will need to work with to gain access to the data. He or she will be using many of the terms in this handout when you meet!
Decision List - see Decision Trees.

Decision Support Software (DSS) - any software that uses analysis to improve decision-making. Also called Decision Support System.
Decision Trees - a method of finding rules (rule induction) that divide the data into subgroups that are as similar as possible with regard to a target variable. See the example below for a tree that predicts survival rates for heart attack victims in an emergency room setting (made up for simplicity's sake). The whole training dataset of 100 patients is called the root node. It is divided logically into subgroups called branches that are further subdivided into other branches or, finally, leaves. The process of continuing to subdivide the groups is called recursive partitioning. Decision trees are the most popular method of displaying rules. If this sequence of rules is written out in English (or a computer language) it is called a decision list. If the complete set of steps required to reach each decision is written out so that the rules no longer need to be read in sequence, they are called a rule set. If the decision tree predicts a categorical outcome such as purchase or not, it is called a classification tree. If it predicts a continuous variable such as dollar amount purchased, it is called a regression tree. The most popular decision tree models are called Chi-squared Automatic Interaction Detection (CHAID), Classification and Regression Trees (CART) and C4.5.
[Figure: example decision tree. Root node: patients, 40% died, 60% lived. First split on Blood Pressure: 70% lived in the lower branch, 30% lived in the higher branch. Each branch is split again on Cholesterol; the four leaves show 90%, 80%, 20% and 10% lived.]
Dependent Variable - see Target Variable.
Drill-Down - a request for more detailed information, usually made by double-clicking on a number or a part of a graph. For example, a table may show average salaries of professors broken down by department and gender. Drilling down on a cell of that table might display that relationship for each campus. The opposite of roll-up.
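
To illustrate the difference between a decision list and a rule set described in the Decision Trees entry above, here is a minimal Python sketch of the heart attack example. The cut points `BP_CUTOFF` and `CHOL_CUTOFF` are hypothetical, since the text gives no thresholds, and the assignment of the four leaf survival rates to branches is likewise an assumption.

```python
# Hypothetical cut points; the source gives no threshold values.
BP_CUTOFF = 120    # assumed blood pressure threshold
CHOL_CUTOFF = 200  # assumed cholesterol threshold


def decision_list(blood_pressure, cholesterol):
    """Decision list: rules are read in sequence and the first match wins."""
    if blood_pressure < BP_CUTOFF:
        if cholesterol < CHOL_CUTOFF:
            return 0.90
        return 0.80
    if cholesterol < CHOL_CUTOFF:
        return 0.20
    return 0.10


# Rule set: every rule spells out its complete conditions, so order is irrelevant.
RULE_SET = [
    (lambda bp, ch: bp >= BP_CUTOFF and ch >= CHOL_CUTOFF, 0.10),
    (lambda bp, ch: bp < BP_CUTOFF and ch < CHOL_CUTOFF, 0.90),
    (lambda bp, ch: bp >= BP_CUTOFF and ch < CHOL_CUTOFF, 0.20),
    (lambda bp, ch: bp < BP_CUTOFF and ch >= CHOL_CUTOFF, 0.80),
]


def rule_set(blood_pressure, cholesterol):
    """Return the survival rate of the one rule whose conditions all hold."""
    for condition, survival_rate in RULE_SET:
        if condition(blood_pressure, cholesterol):
            return survival_rate
    raise ValueError("no rule matched")


print(decision_list(110, 180), rule_set(110, 180))
```

Both functions return the same predicted survival rate for any patient; only the way the rules are written differs.
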

DSS - see Decision Support Software.
Ensemble Model - a model that combines the results of several types of models. For example, a prediction could use the average of the estimates from a decision tree, a neural network and a statistical model.
Enterprise Resource Planning (ERP) - software that stores the core operational data of a business, such as sales, receivables and payables, in a database. One of the three main areas to which data mining is applied: supply chain management (SCM), enterprise resource planning (ERP) and customer relationship management (CRM).
ERP - see Enterprise Resource Planning.
Estimate - develop a model to find an approximate value for a continuous variable, e.g. sales, blood pressure.
ETL or ETML - stands for Extract, Transform, Move and Load, the steps required to gain access to data for analysis. Since the Move step is the easiest, the M is often left out. ETL is an important subset of Data Management.
Executive Information System (EIS) - systems that any decision maker can use with little training to do ad-hoc analyses, often using OLAP.
Expert Systems - a system that can solve a problem by incorporating rules manually obtained from human experts. You describe the problem and let it choose how best to solve it. Examples in the area of analysis include SigmaStat, DecisionTime and the SPSS Statistics Coach. Their decisions are rather flaky at the moment, but improving.
Flat File - data stored in a standard format used to move data from one program to another. Windows and Macintosh call this a Text Only or Text With Line Breaks file. It may also be called an ASCII, EBCDIC (on large IBM computers) or UNICODE file.
Front Office - the part of a company that customers interact with. Customer data is critical to business profitability, so it is frequently mined.
Heuristic - see Modeling.
IQ - see Data Quality.
Imputation - the process of estimating the values of missing data prior to analysis.
Independent Variable - see Input Variables.
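
As a small illustration of the Imputation entry above, the sketch below fills missing values with the mean of the observed values. Mean imputation is only one simple technique, chosen here as an assumption for the example; the function name and data are hypothetical.

```python
from statistics import mean


def impute_mean(values, missing=None):
    """Replace entries equal to the `missing` code with the mean of the observed values."""
    observed = [v for v in values if v != missing]
    if not observed:
        raise ValueError("no observed values to estimate from")
    fill = mean(observed)
    return [fill if v == missing else v for v in values]


# Hypothetical blood pressure readings with two missing values.
blood_pressure = [120, None, 130, 110, None]
print(impute_mean(blood_pressure))

# The same idea when missing values are stored as a numeric code such as 999.
print(impute_mean([1, 999, 3], missing=999))
```

More careful methods estimate each missing value from the other variables of the same record rather than using one overall mean.
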
Information Quality - see Data Quality.
Input Variables - the variables thought to be related to, predict or cause the target variable. In data mining, almost any variable that is not the target variable is a candidate for an input variable.

Join - a database procedure that pools the information stored in different tables so that it can be better analyzed.
Key Performance Indicator (KPI) - a very important variable. In business, it is a measure that is critically important to the overall functioning of the organization.
KPI - see Key Performance Indicator.
Lift - a measure used to compare different data mining models. Essentially it is a measure of how much better off you are with the model than without. For example, if 2% of all the customers you mail a catalog to would make a purchase, but 10% of the recipients selected by the model would make a purchase, then lift is 10/2 or 5.
Machine Learning - models that enable the computer to improve its performance through experience, especially rule induction. The definition of learning is so loose that, although rarely mentioned in this context, statistical estimation could also be considered learning. Modeling is roughly synonymous with machine learning.
Market Basket Analysis - see Association Analysis.
Mart - see Data Mart.
MDDB - see Data Cube.
Measurement Scale - the level of detail in a variable. The measurement scale helps determine the role of the variable in an analysis. Types include:
    Single-valued variables, or constants, which result from selected subsets.
    Binary variables, which have only two values such as male/female, purchased/didn't purchase.
    Nominal variables, which contain category memberships such as political party. Also called categorical, class, group, symbolic or qualitative variables.
    Ordinal variables, which contain values that have order such as small, medium, large.
    Interval or continuous variables, which have meaningful intervals, such as a weight interval of 110 pounds to 120 being the same as 120 to 130. Interval-level variables are also called numeric or scale variables.
Merge - combining datasets or marts so that their rows are aligned and new columns are added. Row alignment is often done using a key such as an ID number.
Metadata - data about the data. Examples are:
    Column names such as gender, height.
    Column labels containing descriptors to embellish output. Entire questions from questionnaires are common labels.
    Formats that describe what values mean, such as 0=Female, 1=Male. Also called value labels or codes.
    Missing value codes, if other than blank, e.g. 999.
    Scale of each column: nominal, ordinal, interval.
    Formulas or recoding steps that were followed.
    Documentation such as who, what, where, when and why the data were collected.
MOLAP - see Online Analytical Processing.
Modeling - generally refers to the process of developing rules which can classify or predict with an estimated level of precision. The rules may be in the form of a series of logical statements or mathematical formulas.

Statistical models are equations that have been mathematically derived to provide the best or optimal description of relationships that involve straight lines, smooth curves, group membership or clusters of similar cases. The solution to these equations usually requires simplifying assumptions about the nature of the data that will not fit every dataset.
Heuristic models use methods that have been empirically shown to work well, but which have not been shown to be the best or optimal solution. Heuristic models usually make comparatively few assumptions about the nature of the data. Decision trees are an example of an analysis based on heuristics, while discriminant analysis is based on an optimal method (which assumes the data follows a multivariate normal distribution).
Multidimensional Database - see Data Cube.
Neural Networks - models that mimic the brain through systems of equations. They learn by being trained with a dataset. Unfortunately, what they learn is conveyed by a series of complex mathematical formulas. These formulas may work well but not explain much about the process they model. See Black Box.
ODBC - see Open Database Connectivity.
OLAP - see Online Analytical Processing.
OLE DB - an open standard for gaining access to the data stored in a multidimensional database. Most OLAP products use OLE DB to access the data.
OLTP - see Online Transaction Processing.
Online Analytical Processing (OLAP) - software that quickly displays interactive tables or graphs of pre-selected variables such as sales aggregated by time, region, state, store and product line. A two-dimensional slice of the data might show mean sales broken down by region and product line. Clicking on region might drill down to further divide those numbers by state. OLAP is often not considered data mining since it involves only simple tables and graphs that display what is happening rather than the analytics that can help determine why it is happening or what may happen in the future. However, it is very widely discussed in the data mining literature. OLAP usually displays only sums, counts and means. This is because means at any level of breakdown can be calculated from sums of sums and sums of counts. However, the median of an aggregate is not the aggregate of the medians, which is why medians and percentiles cannot be used with the data structure OLAP requires (a multidimensional database). This is a major limitation in the method. OLAP is occasionally referred to as MOLAP because it runs very quickly using a Multidimensional database. When OLAP is used with a standard relational database, it is called ROLAP. ROLAP is usually hundreds of times slower than MOLAP.
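
The claim in the Online Analytical Processing entry above, that means can be rebuilt from stored sums and counts while medians cannot, can be checked with a short Python sketch. The store names and sales figures are hypothetical.

```python
from statistics import mean, median

# Hypothetical individual sales for three stores in one region.
stores = {
    "store_a": [10, 12, 14],
    "store_b": [100],
    "store_c": [20, 22],
}

# What a cube keeps per cell: only a sum and a count, not the raw values.
cells = {name: (sum(values), len(values)) for name, values in stores.items()}

# Rolling up to the region: the mean is the sum of sums over the sum of counts.
region_sum = sum(cell_sum for cell_sum, _ in cells.values())
region_count = sum(cell_count for _, cell_count in cells.values())
region_mean = region_sum / region_count

all_sales = [x for values in stores.values() for x in values]
print(region_mean, mean(all_sales))  # the two means agree

# Medians do not roll up: the median of the store medians differs from the true median.
median_of_medians = median(median(values) for values in stores.values())
true_median = median(all_sales)
print(median_of_medians, true_median)
```

Getting the true median requires the individual sales, which the aggregated cells no longer contain.
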

Online Transaction Processing (OLTP) - involves the efficient execution of frequent database transactions used to collect data or run a business. These transactions are recorded in a database rather than a data warehouse and are not suitable for analysis until they have been transferred to a warehouse and restructured for efficient analysis.
Open Database Connectivity (ODBC) - a widely used standard for extracting data from a data warehouse to use for analysis.
Over-fitted Model - a model which has become so complex that it applies only to the dataset upon which it was developed. Another term for this is over-parameterized.
Predictive Analytics - see Analytics.
Profit - the amount of profit made in a specific modeling situation, calculated by estimating the cost of each type of error: concluding that something is true when it is false, and vice versa. It can be used in business for obvious reasons but also in other areas. For example, in medicine you could assign a cost to concluding a patient has a treatable disease when they do not (antibiotic treatment = $35) versus concluding they do not have it when they actually do (treatment after complications set in = $2,300 hospital stay). The point of maximum profit would show you the best way to use the model.
Qualitative Data - depending on the context, this term may refer to text data such as messages or to categorical variables such as gender.
Query - the process of asking a database questions. Often done in an ad-hoc, interactive way.
Regression Analysis - a family of statistical models that includes fitting straight lines, called linear regression; smoothly curving lines, called polynomial regression; more sharply curving lines, called nonlinear regression; and models that predict group membership, called logistic regression. The main type of data mining that these generalized linear models do not do is clustering.
Relational Database - see Database.
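
The medical example in the Profit entry above can be worked through in Python. The two error costs come from the entry; the patient scores, the candidate thresholds and the function name are hypothetical. The sketch minimizes total error cost, which here plays the role of maximizing profit.

```python
COST_FALSE_POSITIVE = 35    # antibiotics given to a patient without the disease
COST_FALSE_NEGATIVE = 2300  # hospital stay after a missed case

# Hypothetical (model score, patient actually has the disease) pairs.
patients = [
    (0.9, True), (0.7, True), (0.6, False),
    (0.4, True), (0.3, False), (0.1, False),
]


def error_cost(threshold, patients):
    """Total cost of both error types when treating every score >= threshold."""
    cost = 0
    for score, has_disease in patients:
        treated = score >= threshold
        if treated and not has_disease:
            cost += COST_FALSE_POSITIVE
        elif not treated and has_disease:
            cost += COST_FALSE_NEGATIVE
    return cost


thresholds = [0.2, 0.5, 0.8]
for t in thresholds:
    print(t, error_cost(t, patients))

best = min(thresholds, key=lambda t: error_cost(t, patients))
print("lowest-cost threshold:", best)
```

Because a missed case costs so much more than an unnecessary prescription, the cheapest way to use this model is to treat even low-scoring patients.
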
Report - a basic listing of database information, which may consist of individual values, sums, counts or means. Often done in a pre-planned, static way.
Return on Investment (ROI) - the money you saved by doing data mining. It is above and beyond the usually high expenses of purchasing a data mining package, learning to use it and then using it to solve a problem.
ROI - see Return on Investment.
ROLAP - see Online Analytical Processing.
Roll-Up - the process of aggregating numeric data. For example, a series of tables may show professor salaries broken down by department and gender at each campus. The campuses could be rolled up to create a single table of salaries broken down by department and gender for all campuses combined. The opposite of drill-down.
Rule Induction - see Decision Trees.

Rule Set - see Decision Trees.
SCM - see Supply Chain Management.
Scoring - applying a model to new data, usually to predict values of continuous variables such as amount of purchase, or group memberships such as survive/die.
Sequel - see Structured Query Language.
SRM - stands for Supplier Relationship Management. See Supply Chain Management.
Statistical Models - see Modeling.
Structured Query Language (SQL) - the basic language used in almost all databases. It allows you to search a database for basic information, such as listing certain records or totaling sums or counts. It also lets you select subsets or samples, or perform joins. The basic form is SELECT vars FROM tablename WHERE logical condition is true. Often pronounced sequel.
Supervised Learning - the process of developing a model that has a target variable, such as sales or survival, to supervise it. This is opposed to unsupervised learning, which is developing a model without a target, i.e. finding clusters or groups of similar records in your data. When the target variable is categorical, biologists would call this Class Prediction and statisticians would call it Discriminant Analysis or Logistic Regression (two different methods of achieving a similar result).
Supplier Relationship Management (SRM) - see Supply Chain Management.
Supply Chain Management (SCM) - the process of studying and interacting with suppliers to maximize profits. Also called supplier relationship management (SRM). One of the three main areas to which data mining is applied: supply chain management (SCM), enterprise resource planning (ERP) and customer relationship management (CRM).
Target Variable - the main variable of interest in a data mining project. A business example is the amount of each sale; a medical example is cure/no cure. Also called the predicted, supervisor or dependent variable.
Test Data - used only once at the end of the data mining project to see if the best or champion model generalizes to completely new data.
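
The uses of SQL named in the Structured Query Language entry above (listing records, totaling sums and counts, and performing joins) can be tried with Python's built-in sqlite3 module. The table names, columns and rows are hypothetical and exist only for this sketch.

```python
import sqlite3

# An in-memory database with two small hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales (customer_id INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "East", 20.0), (2, "West", 35.0), (1, "East", 15.0)],
)

# SELECT vars FROM tablename WHERE condition: list certain records.
east_rows = conn.execute(
    "SELECT customer_id, amount FROM sales WHERE region = 'East' ORDER BY amount"
).fetchall()

# Total sums and counts for each region.
totals = conn.execute(
    "SELECT region, SUM(amount), COUNT(*) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Join the two tables on their common key.
joined = conn.execute(
    "SELECT c.name, s.amount FROM sales AS s "
    "JOIN customers AS c ON c.customer_id = s.customer_id ORDER BY s.amount"
).fetchall()

print(east_rows)
print(totals)
print(joined)
conn.close()
```

The join works because customer_id acts as the key that links the two tables.
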
Text Data - written descriptions, e.g. open-ended survey questions, interviews, customer complaints; also called qualitative data.
Text File - see Flat File.
Text Mining - the process of automatically finding the key concepts contained in text data. It may also find clusters of similar documents. The numeric output containing presence/absence of each concept and cluster membership is often passed on to a data mining step where it is combined with other numeric data for further analysis.
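
The numeric presence/absence output described in the Text Mining entry above might look like the following Python sketch. The concept keyword lists are supplied by hand here as an assumption; real text mining software finds the concepts automatically. The complaints are made up.

```python
# Hypothetical concepts, each defined by a hand-picked set of keywords.
concepts = {
    "billing": {"invoice", "charge", "refund"},
    "delivery": {"late", "shipping", "package"},
}

# Hypothetical customer complaints.
complaints = [
    "The invoice shows a charge I did not make",
    "My package arrived late",
    "Late shipping and no refund",
]


def concept_flags(text, concepts):
    """Return 1 for each concept with at least one keyword in the text, else 0."""
    words = set(text.lower().split())
    return {name: int(bool(words & keywords)) for name, keywords in concepts.items()}


rows = [concept_flags(text, concepts) for text in complaints]
for text, row in zip(complaints, rows):
    print(row, text)
```

Each row of flags is ordinary numeric data, so it can be merged with other variables about the same customer for further analysis.
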

Training Data - the data which is used to develop models that will, if done properly, work well on new sets of data.
Unsupervised Training/Learning - the process of developing a model that does not have a target variable. This boils down to finding clusters of similar cases within the data. It would usually be followed by another analysis that does involve a target variable. Biologists would call this class discovery. Statisticians would call it cluster analysis.
Validation Data - during data mining, each step of each model developed using the training data is tested using this data to discover the point at which the model becomes overly specific to that single set of data, i.e. becomes an over-fitted model.
Variable - see Data Table.
Visualization - the use of dynamic, interactive graphical displays to search for useful patterns in data.
Warehouse - see Data Warehouse.


Business Analytics and Data Visualization. Decision Support Systems Chattrakul Sombattheera Business Analytics and Data Visualization Decision Support Systems Chattrakul Sombattheera Agenda Business Analytics (BA): Overview Online Analytical Processing (OLAP) Reports and Queries Multidimensionality

More information

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d.

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d. EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER ANALYTICS LIFECYCLE Evaluate & Monitor Model Formulate Problem Data Preparation Deploy Model Data Exploration Validate Models

More information

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

Chapter 5. Warehousing, Data Acquisition, Data. Visualization Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization 5-1 Learning Objectives

More information

Easily Identify the Right Customers

Easily Identify the Right Customers PASW Direct Marketing 18 Specifications Easily Identify the Right Customers You want your marketing programs to be as profitable as possible, and gaining insight into the information contained in your

More information

5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2

5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2 Class Announcements TIM 50 - Business Information Systems Lecture 15 Database Assignment 2 posted Due Tuesday 5/26 UC Santa Cruz May 19, 2015 Database: Collection of related files containing records on

More information

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc.

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc. Oracle9i Data Warehouse Review Robert F. Edwards Dulcian, Inc. Agenda Oracle9i Server OLAP Server Analytical SQL Data Mining ETL Warehouse Builder 3i Oracle 9i Server Overview 9i Server = Data Warehouse

More information

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria

More information

University of Gaziantep, Department of Business Administration

University of Gaziantep, Department of Business Administration University of Gaziantep, Department of Business Administration The extensive use of information technology enables organizations to collect huge amounts of data about almost every aspect of their businesses.

More information

Business Intelligence Solutions. Cognos BI 8. by Adis Terzić

Business Intelligence Solutions. Cognos BI 8. by Adis Terzić Business Intelligence Solutions Cognos BI 8 by Adis Terzić Fairfax, Virginia August, 2008 Table of Content Table of Content... 2 Introduction... 3 Cognos BI 8 Solutions... 3 Cognos 8 Components... 3 Cognos

More information

Technology in Action. Alan Evans Kendall Martin Mary Anne Poatsy. Eleventh Edition. Copyright 2015 Pearson Education, Inc.

Technology in Action. Alan Evans Kendall Martin Mary Anne Poatsy. Eleventh Edition. Copyright 2015 Pearson Education, Inc. Copyright 2015 Pearson Education, Inc. Technology in Action Alan Evans Kendall Martin Mary Anne Poatsy Eleventh Edition Copyright 2015 Pearson Education, Inc. Technology in Action Chapter 9 Behind the

More information

Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal

Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether

More information

A Property & Casualty Insurance Predictive Modeling Process in SAS

A Property & Casualty Insurance Predictive Modeling Process in SAS Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

More information

Gerry Hobbs, Department of Statistics, West Virginia University

Gerry Hobbs, Department of Statistics, West Virginia University Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit

More information

Data Mart/Warehouse: Progress and Vision

Data Mart/Warehouse: Progress and Vision Data Mart/Warehouse: Progress and Vision Institutional Research and Planning University Information Systems What is data warehousing? A data warehouse: is a single place that contains complete, accurate

More information

Chapter 7: Data Mining

Chapter 7: Data Mining Chapter 7: Data Mining Overview Topics discussed: The Need for Data Mining and Business Value The Data Mining Process: Define Business Objectives Get Raw Data Identify Relevant Predictive Variables Gain

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT BUILDING BLOCKS OF DATAWAREHOUSE G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT 1 Data Warehouse Subject Oriented Organized around major subjects, such as customer, product, sales. Focusing on

More information

CHAPTER 5: BUSINESS ANALYTICS

CHAPTER 5: BUSINESS ANALYTICS Chapter 5: Business Analytics CHAPTER 5: BUSINESS ANALYTICS Objectives The objectives are: Describe Business Analytics. Explain the terminology associated with Business Analytics. Describe the data warehouse

More information

PREFACE INTRODUCTION MULTI-DIMENSIONAL MODEL. Chris Claterbos, Vlamis Software Solutions, Inc. dvlamis@vlamis.com

PREFACE INTRODUCTION MULTI-DIMENSIONAL MODEL. Chris Claterbos, Vlamis Software Solutions, Inc. dvlamis@vlamis.com BUILDING CUBES AND ANALYZING DATA USING ORACLE OLAP 11G Chris Claterbos, Vlamis Software Solutions, Inc. dvlamis@vlamis.com PREFACE As of this writing, Oracle Business Intelligence and Oracle OLAP are

More information

Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90

Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90 FREE echapter C H A P T E R1 Big Data and Analytics Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90 percent of the data in the

More information

MBA 8473 - Data Mining & Knowledge Discovery

MBA 8473 - Data Mining & Knowledge Discovery MBA 8473 - Data Mining & Knowledge Discovery MBA 8473 1 Learning Objectives 55. Explain what is data mining? 56. Explain two basic types of applications of data mining. 55.1. Compare and contrast various

More information

Decision Trees What Are They?

Decision Trees What Are They? Decision Trees What Are They? Introduction...1 Using Decision Trees with Other Modeling Approaches...5 Why Are Decision Trees So Useful?...8 Level of Measurement... 11 Introduction Decision trees are a

More information

Data Mining. Knowledge Discovery, Data Warehousing and Machine Learning Final remarks. Lecturer: JERZY STEFANOWSKI

Data Mining. Knowledge Discovery, Data Warehousing and Machine Learning Final remarks. Lecturer: JERZY STEFANOWSKI Data Mining Knowledge Discovery, Data Warehousing and Machine Learning Final remarks Lecturer: JERZY STEFANOWSKI Email: Jerzy.Stefanowski@cs.put.poznan.pl Data Mining a step in A KDD Process Data mining:

More information

Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex,

Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex, Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex, Inc. Overview Introduction What is Business Intelligence?

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing

1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing 1. OLAP is an acronym for a. Online Analytical Processing b. Online Analysis Process c. Online Arithmetic Processing d. Object Linking and Processing 2. What is a Data warehouse a. A database application

More information

Data Warehouse design

Data Warehouse design Data Warehouse design Design of Enterprise Systems University of Pavia 21/11/2013-1- Data Warehouse design DATA PRESENTATION - 2- BI Reporting Success Factors BI platform success factors include: Performance

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

IMPLEMENTATION OF DATA WAREHOUSE SAP BW IN THE PRODUCTION COMPANY. Maria Kowal, Galina Setlak

IMPLEMENTATION OF DATA WAREHOUSE SAP BW IN THE PRODUCTION COMPANY. Maria Kowal, Galina Setlak 174 No:13 Intelligent Information and Engineering Systems IMPLEMENTATION OF DATA WAREHOUSE SAP BW IN THE PRODUCTION COMPANY Maria Kowal, Galina Setlak Abstract: in this paper the implementation of Data

More information

Integrated Data Mining and Knowledge Discovery Techniques in ERP

Integrated Data Mining and Knowledge Discovery Techniques in ERP Integrated Data Mining and Knowledge Discovery Techniques in ERP I Gandhimathi Amirthalingam, II Rabia Shaheen, III Mohammad Kousar, IV Syeda Meraj Bilfaqih I,III,IV Dept. of Computer Science, King Khalid

More information

Classification and Prediction

Classification and Prediction Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser

More information

Data Mining for Fun and Profit

Data Mining for Fun and Profit Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

More information

Enhancing Compliance with Predictive Analytics

Enhancing Compliance with Predictive Analytics Enhancing Compliance with Predictive Analytics FTA 2007 Revenue Estimation and Research Conference Reid Linn Tennessee Department of Revenue reid.linn@state.tn.us Sifting through a Gold Mine of Tax Data

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Data, Measurements, Features

Data, Measurements, Features Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

More information

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

More information

Nine Common Types of Data Mining Techniques Used in Predictive Analytics

Nine Common Types of Data Mining Techniques Used in Predictive Analytics 1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

More information

How to Get More Value from Your Survey Data

How to Get More Value from Your Survey Data Technical report How to Get More Value from Your Survey Data Discover four advanced analysis techniques that make survey research more effective Table of contents Introduction..............................................................2

More information

Basic Concepts in Research and Data Analysis

Basic Concepts in Research and Data Analysis Basic Concepts in Research and Data Analysis Introduction: A Common Language for Researchers...2 Steps to Follow When Conducting Research...3 The Research Question... 3 The Hypothesis... 4 Defining the

More information

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. # 31 Introduction to Data Warehousing and OLAP Part 2 Hello and

More information

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are

More information

An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century

An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century Nora Galambos, PhD Senior Data Scientist Office of Institutional Research, Planning & Effectiveness Stony Brook University AIRPO

More information

from Larson Text By Susan Miertschin

from Larson Text By Susan Miertschin Decision Tree Data Mining Example from Larson Text By Susan Miertschin 1 Problem The Maximum Miniatures Marketing Department wants to do a targeted mailing gpromoting the Mythic World line of figurines.

More information

A New Approach for Evaluation of Data Mining Techniques

A New Approach for Evaluation of Data Mining Techniques 181 A New Approach for Evaluation of Data Mining s Moawia Elfaki Yahia 1, Murtada El-mukashfi El-taher 2 1 College of Computer Science and IT King Faisal University Saudi Arabia, Alhasa 31982 2 Faculty

More information

Business Intelligence and Decision Support Systems

Business Intelligence and Decision Support Systems Chapter 12 Business Intelligence and Decision Support Systems Information Technology For Management 7 th Edition Turban & Volonino Based on lecture slides by L. Beaubien, Providence College John Wiley

More information

IT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users

IT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users 1 IT and CRM A basic CRM model Data source & gathering Database Data warehouse Information delivery Information users 2 IT and CRM Markets have always recognized the importance of gathering detailed data

More information

Chapter 4 Getting Started with Business Intelligence

Chapter 4 Getting Started with Business Intelligence Chapter 4 Getting Started with Business Intelligence Learning Objectives and Learning Outcomes Learning Objectives Getting started on Business Intelligence 1. Understanding Business Intelligence 2. The

More information

CHAPTER 4: BUSINESS ANALYTICS

CHAPTER 4: BUSINESS ANALYTICS Chapter 4: Business Analytics CHAPTER 4: BUSINESS ANALYTICS Objectives Introduction The objectives are: Describe Business Analytics Explain the terminology associated with Business Analytics Describe the

More information

3/17/2009. Knowledge Management BIKM eclassifier Integrated BIKM Tools

3/17/2009. Knowledge Management BIKM eclassifier Integrated BIKM Tools Paper by W. F. Cody J. T. Kreulen V. Krishna W. S. Spangler Presentation by Dylan Chi Discussion by Debojit Dhar THE INTEGRATION OF BUSINESS INTELLIGENCE AND KNOWLEDGE MANAGEMENT BUSINESS INTELLIGENCE

More information

Basics of Dimensional Modeling

Basics of Dimensional Modeling Basics of Dimensional Modeling Data warehouse and OLAP tools are based on a dimensional data model. A dimensional model is based on dimensions, facts, cubes, and schemas such as star and snowflake. Dimensional

More information

Data Warehousing and Data Mining

Data Warehousing and Data Mining Data Warehousing and Data Mining Part I: Data Warehousing Gao Cong gaocong@cs.aau.dk Slides adapted from Man Lung Yiu and Torben Bach Pedersen Course Structure Business intelligence: Extract knowledge

More information

On-Line Application Processing. Warehousing Data Cubes Data Mining

On-Line Application Processing. Warehousing Data Cubes Data Mining On-Line Application Processing Warehousing Data Cubes Data Mining 1 Overview Traditional database systems are tuned to many, small, simple queries. Some new applications use fewer, more time-consuming,

More information

Seamless Dynamic Web Reporting with SAS D.J. Penix, Pinnacle Solutions, Indianapolis, IN

Seamless Dynamic Web Reporting with SAS D.J. Penix, Pinnacle Solutions, Indianapolis, IN Seamless Dynamic Web Reporting with SAS D.J. Penix, Pinnacle Solutions, Indianapolis, IN ABSTRACT The SAS Business Intelligence platform provides a wide variety of reporting interfaces and capabilities

More information

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

More information

IBM SPSS Direct Marketing 19

IBM SPSS Direct Marketing 19 IBM SPSS Direct Marketing 19 Note: Before using this information and the product it supports, read the general information under Notices on p. 105. This document contains proprietary information of SPSS

More information

DATA WAREHOUSE E KNOWLEDGE DISCOVERY

DATA WAREHOUSE E KNOWLEDGE DISCOVERY DATA WAREHOUSE E KNOWLEDGE DISCOVERY Prof. Fabio A. Schreiber Dipartimento di Elettronica e Informazione Politecnico di Milano DATA WAREHOUSE (DW) A TECHNIQUE FOR CORRECTLY ASSEMBLING AND MANAGING DATA

More information

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Chapter 6. Foundations of Business Intelligence: Databases and Information Management Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

How To Make A Credit Risk Model For A Bank Account

How To Make A Credit Risk Model For A Bank Account TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző csaba.fozo@lloydsbanking.com 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions

More information

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised

More information

INTRODUCTION TO BUSINESS INTELLIGENCE What to consider implementing a Data Warehouse and Business Intelligence

INTRODUCTION TO BUSINESS INTELLIGENCE What to consider implementing a Data Warehouse and Business Intelligence INTRODUCTION TO BUSINESS INTELLIGENCE What to consider implementing a Data Warehouse and Business Intelligence Summary: This note gives some overall high-level introduction to Business Intelligence and

More information

Ezgi Dinçerden. Marmara University, Istanbul, Turkey

Ezgi Dinçerden. Marmara University, Istanbul, Turkey Economics World, Mar.-Apr. 2016, Vol. 4, No. 2, 60-65 doi: 10.17265/2328-7144/2016.02.002 D DAVID PUBLISHING The Effects of Business Intelligence on Strategic Management of Enterprises Ezgi Dinçerden Marmara

More information

14. Data Warehousing & Data Mining

14. Data Warehousing & Data Mining 14. Data Warehousing & Data Mining Data Warehousing Concepts Decision support is key for companies wanting to turn their organizational data into an information asset Data Warehouse "A subject-oriented,

More information

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1 Slide 29-1 Chapter 29 Overview of Data Warehousing and OLAP Chapter 29 Outline Purpose of Data Warehousing Introduction, Definitions, and Terminology Comparison with Traditional Databases Characteristics

More information

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

More information

Business Intelligence: Using Data for More Than Analytics

Business Intelligence: Using Data for More Than Analytics Business Intelligence: Using Data for More Than Analytics Session 672 Session Overview Business Intelligence: Using Data for More Than Analytics What is Business Intelligence? Business Intelligence Solution

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

A Review of Data Mining Techniques

A Review of Data Mining Techniques Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

More information

Data Mining for Business Analytics

Data Mining for Business Analytics Data Mining for Business Analytics Lecture 2: Introduction to Predictive Modeling Stern School of Business New York University Spring 2014 MegaTelCo: Predicting Customer Churn You just landed a great analytical

More information

Data Mining for Successful Healthcare Organizations

Data Mining for Successful Healthcare Organizations Data Mining for Successful Healthcare Organizations For successful healthcare organizations, it is important to empower the management and staff with data warehousing-based critical thinking and knowledge

More information

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information