MBA - INFORMATION TECHNOLOGY MANAGEMENT (MBAITM) Term-End Examination December, 2014 MBMI-012 : BUSINESS INTELLIGENCE SECTION I



Similar documents
Data Mining Techniques

Data Mining Techniques Chapter 6: Decision Trees

Data Mining Algorithms Part 1. Dejan Sarka

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

Product recommendations and promotions (couponing and discounts) Cross-sell and Upsell strategies

Data Mining Applications in Higher Education

Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1

Tutorial Segmentation and Classification

Data Mining with SQL Server Data Tools

Role of Neural network in data mining

Course Syllabus. Purposes of Course:

Foundations of Artificial Intelligence. Introduction to Data Mining

The Data Mining Process

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

not possible or was possible at a high cost for collecting the data.

Data Mining In Excel: Lecture Notes and Cases

Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010.

Maschinelles Lernen mit MATLAB

BIG DATA What it is and how to use?

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

Credit Card Fraud Detection Using Self Organised Map

Predictive Modeling Techniques in Insurance

Social Media Mining. Data Mining Essentials

from Larson Text By Susan Miertschin

Chapter 12 Discovering New Knowledge Data Mining

Supervised and unsupervised learning - 1

Discovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III

Introduction to Artificial Intelligence G51IAI. An Introduction to Data Mining

Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar

IBM SPSS Direct Marketing 22

A Hybrid Data Mining Model to Improve Customer Response Modeling in Direct Marketing

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

Developing Relevant Dining Visits with Oracle Advanced Analytics Olive Garden s transition toward tailoring guests experiences

Data Mining Applications in Fund Raising

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and

Machine Learning: Overview

Explode Six Direct Marketing Myths

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

Azure Machine Learning, SQL Data Mining and R

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important

IBM SPSS Direct Marketing 23

W6.B.1. FAQs CS535 BIG DATA W6.B If the distance of the point is additionally less than the tight distance T 2, remove it from the original set

Using Data Mining for Mobile Communication Clustering and Characterization

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Increasing Debit Card Utilization and Generating Revenue using SUPER Segments

Easily Identify the Right Customers

Easily Identify Your Best Customers

CS 6220: Data Mining Techniques Course Project Description

Machine Learning using MapReduce

Performance Metrics for Graph Mining Tasks

Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign

Working with telecommunications

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges

Classification and Prediction

DATA MINING TECHNIQUES AND APPLICATIONS

Data Mining from A to Z: Better Insights, New Opportunities WHITE PAPER

Chapter 7: Data Mining

Apache Hama Design Document v0.6

IBM SPSS Direct Marketing

Maximize Revenues on your Customer Loyalty Program using Predictive Analytics

Lowering social cost of car accidents by predicting high-risk drivers

Data Mining + Business Intelligence. Integration, Design and Implementation

TDA and Machine Learning: Better Together

Introduction to Data Mining

WHITE PAPER: DATA DRIVEN MARKETING DECISIONS IN THE RETAIL INDUSTRY

Revenue s Business Context

Data Mining Part 5. Prediction

K-Means Clustering Tutorial

Hexaware E-book on Predictive Analytics

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

Where s my real ROI? White Paper #1 February expert Services

KNIME TUTORIAL. Anna Monreale KDD-Lab, University of Pisa

Innovations and Value Creation in Predictive Modeling. David Cummings Vice President - Research

IBA Business Analytics Data Challenge

Three proven methods to achieve a higher ROI from data mining

Data Mining: A Closer Look

8. Machine Learning Applied Artificial Intelligence

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

How To Cluster

ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)

Banking Analytics Training Program

Advanced analytics at your hands

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

ConsumerViewSM. Insight on more than 235 million consumers and 113 million households to improve your marketing campaigns

«The Five Myths of Predictive Analytics» 1

KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES

RFM Analysis: The Key to Understanding Customer Buying Behavior

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.

Using multiple models: Bagging, Boosting, Ensembles, Forests

Data Mining for Business Intelligence. Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner. 2nd Edition

Experian s UK Credit Bureau Scores. Version 1.6

Grid Density Clustering Algorithm

Environmental Remote Sensing GEOG 2021

Transcription:

No. of Printed Pages : 8 I MBMI-012 I MBA - INFORMATION TECHNOLOGY MANAGEMENT (MBAITM) Term-End Examination December, 2014 Time : 3 hours Note : (i) (ii) (iii) (iv) (v) MBMI-012 : BUSINESS INTELLIGENCE Maximum Marks : 100 Section I is compulsory and carries 30 marks Section II carries 70 marks. Answer any five questions. Assume suitable data wherever required. Draw suitable sketches wherever required. Italicized figures to the right indicate maximum marks. SECTION I 1. As part of a promotional campaign, the marketing manager of "Dilshan Retails" wishes to identify the potential buyers to send promotional catalog offers. The manager uses predictive modelling tools to find the potential customers for promotional offers. He takes help from one of the statisticians working in the same company. The objective of the selection is to "Maximize profit with cost". The variables for the selection are as follows : MBMI-012 1 P.T.O.

Variables Purchase(c,2) Rupees Spent Yearly Income Home Value Order Frequency Recency Married(c,2) Name Prefix(c,4) Age Sex(c,2) Telemarket Ind.(c,2) Rents Apartment Occupied <1 Year Domestic Product Apparel Purchase Leisure Product Luxury Items(c,2) Kitchen Product Dishes Purchase Flatware Purchase Total Dining (kitch+dish+fiat) Promo : 1 7 Months Promo : 8 13 Months Value per Mailing Country Code Variables Total Returns Mens Apparel Home Furniture Lamps Purchase Linens Purchase Blankets Purchase Towels Purchase Outdoor Product Coats Purchase Ladies Coats Ladies Apparel His/Her Apparel Jewelry Purchase Date 1st Order Telemarket Order(c,5) Account Number(c,128) State Code(c,55) Race(c,5) Heating Type(c,4) Number of Cars(c,4) Number of Kids Travel Time Education Level(c,4) Job Category Note: 'C' indicates class levels MBMI-012 2

Answer the following : (a) What variables (input and target) are important for identifying the potential customers?. Which method do you prefer for variable selection? 3+3 (b) Identify the suitable predictive analytics technique (model). 2 (c) How do you apply the model using SAS/SEMMA methodology? Explain in detail. 8 2. (a) The following table shows 24 records with their actual class and the probability of them being Class 1 members, as estimated by a classifier. Actual Probability Actual Probability class of Class 1 class of Class 1 1 0.9959 1 0.5055 1 0.99875 0 0.4713 1 0.9844 0 0.3371 1 0.9804 1 0.2179 1 0.9481 0 0.1992 1 0.8892 0 0.1494 1 0.8476 0 0.0479 0 0.7628 0 0.0383 1 0.7069 0 0.0248 1 0.6807 0 0.0218 1 0.6563 0 0.0161 0 0.6224 0 0.0035 Based on the above data, find the misclassification rate when the cutoff is 0.5, 0.25 and 0.75. 10 MBMI-012 3 P.T.O.

(b) Describe the roles assumed by the validation, partition and test partition in supervised learning algorithms. MBMI-012 4

SECTION II 3. A large number of insurance records are to be examined to develop a model for predicting fraudulent claims. Of the claims in the historical database, 1% were judged to be fraud. A sample is taken to develop a model, and oversampling is used to provide a balanced sample in light of the very low response rate. When applied to this sample (N = 800) the model ends up correctly classifying 310 frauds, and 270 no frauds. It missed 90 frauds, and classified 130 records incorrectly as frauds when they were not. (a) Produce the confusion matrix for the sample as it stands. 5 (b) Find the adjusted misclassification rate. 5 (c) What percentage of new records would you expect to be classified as fraudulent? 4 4. (a) For the following classification matrix, find the measures : 2+2+2+2 (i) (ii) (iii) (iv) Overall error Overall accuracy Sensitivity Specificity Predicted 1 0 1 (actual) 201 85 0 (actual) 25 2689 MBMI-012 5 P.T.O.

(b) For the following cases, identify whether the task required is supervised or unsupervised learning. Write two reasons for each case. 2+2+2 (i) (ii) (iii) Deciding whether to issue loan to an applicant based on demographics and financial data (with reference to data base of similar prior customers). Identifying segments of similar customers. Printing of customer discount coupons at the conclusion of a grocery store checkout based on what you just bought and what others have bought previously. 5. Write different distance measures for clustering. Apply k-means clustering algorithm for the following dataset and show clusters with proper distance measures. 5+9 Student Age Subject 1 Subject 2 Subject 3 Si 18 73 75 57 S2 18 79 85 75 S3 23 70 70 52 S4 20 55 55 55 S5 22 85 86 87 S6 19 91 90 85 S7 20 70 65 60 S8 21 53 56 59 S9 19 82 82 60 S10 47 75 76 77 MBMI-012 6

6. (a) What is Artificial Neural Network (ANN)? Explain the general learning process of ANN. Also, judge the following : 3 +4 +3 "A neural network has been constructed to predict the creditworthiness of applicants. There are two output nodes, one for yes (1, yes : 0, no) and one for no (1, no : 0, yes). An applicant received a score of 0.83 for the "yes" output node and a 0.44 for the "no" output node. Discuss what may have happened and whether the applicant is a good credit risk. (b) Explain the factors for choosing the Neural Networks architecture. 4 MBMI-012 7 P.T.O.

7. (a) Apply association rules for the following table, containing telephone faceplate purchases. The numeric 1 (one) indicates presence of an item in the transaction and numeric 0 (zero) indicates absence of transaction. Transaction Red White Blue Orange Green Yellow 1 1 1 0 0 1 0 2 0 1 0 1 0 0 3 0 1 1 0 0 0 4 1 1 0 1 0 0 5 1 0 1 0 0 0 6 0 1 1 0 0 0 7 1 0 1 0 0 0 8 1 1 1 0 1 0 9 1 1 1 0 0 0 10 0 0 0 0 0 1 From the above table, find the association rules with minimum support and minimum confidence of above 50% for red, white and green transactions. 10 (b) Compare and contrast between the measures Confidence and Lift. 4 8. (a) Why are variable measurement level and model role important for model creation? 7 (b) Business intelligence applications enable managers to take decisions in time. Explain it. 7 MBMI-012 8 500