The Application of Exploratory Data Analysis (EDA) in Auditing
1 The Application of Exploratory Data Analysis (EDA) in Auditing
Qi Liu, Ph.D. Candidate (A.B.D.)
Dept. of Accounting & Information Systems, Rutgers University
28th WCARS, November 9, 2013
2 Outline
- Introduction
- An overview of the EDA concept
- EDA in auditing
- An application of EDA in auditing: a credit card retention case
- Future research
3 Introduction
Motivation
- Auditing is a data-intensive process, and data analysis plays an important role in it.
- Current data analysis approaches in auditing focus on validating predefined audit objectives, so they cannot uncover risks the auditor is not yet aware of.
- EDA is often likened to detective work, and one of its objectives is to identify outliers.
- Although some EDA techniques are already used in certain audit procedures, EDA has never been employed systematically in auditing.
Contribution
- This research contributes to the auditing literature by making a first attempt to apply exploratory data analysis in auditing and by illustrating a real-world application in the audit process.
4 Definition of EDA
Exploratory data analysis (EDA) is a data analysis approach that emphasizes pattern recognition and hypothesis generation.
5 EDA vs CDA
Confirmatory data analysis (CDA) is a widely used data analysis approach that emphasizes experimental design, significance testing, estimation, and prediction (Good, 1983).
Comparison of EDA and CDA:
- Reasoning type. EDA: inductive. CDA: deductive.
- Goal. EDA: pattern recognition and hypothesis generation. CDA: estimation, modeling, and hypothesis testing.
- Applied data. EDA: observation data (data collected without a well-defined hypothesis). CDA: experimental data (data collected through formally designed experiments).
- Techniques. EDA: descriptive statistics, data visualization, clustering analysis, process mining. CDA: traditional statistical techniques of inference, significance, and confidence.
- Advantages. EDA: no assumptions required; promotes deeper understanding of the data. CDA: precise; well-established theory and methods.
- Disadvantages. EDA: no conclusive answers; difficult to avoid bias produced by overfitting. CDA: requires unrealistic assumptions; difficult to notice unexpected results.
6 Current Applications of EDA
- Since the 1980s, EDA has been applied in diverse disciplines such as interior design, marketing, industrial engineering, and geography (Chen et al., 2011; Nayaka and Yano, 2010; Koschat and Sabavala, 1994; Wesley et al., 2006; De Mast and Trip, 2007, 2009).
- A framework for applying EDA to practical problem solving consists of three steps: (1) display the data; (2) identify salient features; (3) interpret salient features (De Mast and Kemper, 2009).
7 Framework to apply EDA in auditing
- Step 1: Display the distribution of related fields
- Step 2: Identify salient features from the distribution
- Step 3: Perform CDA to test possible explanations
- Step 4: Identify suspicious cases
- Step 5: Explore the causes of abnormal cases
- Step 6: Perform CDA to confirm the relationship
- Step 7: Report the risks and recommend improvements
Diagram label: Add a new audit objective
8 Credit Card Retention Case
Purpose
- Demonstrate the benefits of applying EDA in the audit process
- Provide a real example to support the proposed guidelines
Scenario
- Clients call the bank asking for a reduction of their card fees. Bank representatives offer discounts to clients to retain their accounts.
Objectives: identify situations in which revenue is lost during fee negotiation because of bank representatives' behavior, such as:
- Bank representatives offer higher discounts than allowed
- Bank representatives usually offer the highest allowable discount without making enough effort to negotiate a lower one
- Bank representatives offer discounts without any negotiation with the clients
9 Data Description
Data (retention dataset)
- Each record represents a customer call
- 195,694 records
- 162 fields
- Time frame: January 2012
Selected attributes
- Original fee (VLR_ANUIDADE_G)
- Actual fee (_Valor da Anuidade de Saída)
- Agent identification (Funcional do Agente)
- Supervisor identification (Funcional do Supervisor)
- Location of the customer service center (Polo de Atendimento)
- Call duration (Tempo de Atendimento de Retenti)
10 Methodology
Data preprocessing
- Discount calculation
Applied EDA techniques
- Descriptive statistics
- Data visualization
- Data transformation
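The slide names a discount calculation step but the transcription does not preserve the formula. The sketch below assumes the discount is the share of the original fee that was waived, (original fee - actual fee) / original fee. The function name, the field names, and the sample values are hypothetical and are not taken from the retention dataset.

```python
from typing import Optional


def discount_rate(original_fee: float, actual_fee: float) -> Optional[float]:
    """Return the waived share of the original fee, or None if undefined.

    Assumed formula (not confirmed by the slides):
        (original_fee - actual_fee) / original_fee
    """
    if original_fee <= 0:
        # A discount relative to a zero or negative fee is not meaningful.
        return None
    return (original_fee - actual_fee) / original_fee


# Hypothetical calls: half discount, full discount, fee increase, zero fee.
calls = [
    {"original_fee": 120.0, "actual_fee": 60.0},
    {"original_fee": 120.0, "actual_fee": 0.0},
    {"original_fee": 120.0, "actual_fee": 150.0},
    {"original_fee": 0.0, "actual_fee": 0.0},
]
discounts = [discount_rate(c["original_fee"], c["actual_fee"]) for c in calls]
print(discounts)
```

Under this assumed definition, an actual fee above the original fee yields a negative discount, and a discount above 100% would require a negative actual fee.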
11 Results Analysis (1/8)
Policy-violating bank representatives and negative discounts
- Bank policy allows bank representatives to offer discounts of up to 100% of the annuity (the annual card fee) to retain the customer.
[Table: Descriptive statistics of discounts]
[Figure: Frequency distribution of discounts. Horizontal axis: share of calls, 0% to 35%. Discount bands: >100%, 90%-100%, 80%-90%, 70%-80%, 60%-70%, 50%-60%, 40%-50%, 30%-40%, 20%-30%, 10%-20%, 0-10%, <0. Bar labels as extracted: 0%, 1.21%, 0.69%, 0.59%, 0.15%, 5.71%, 4.60%, 11.51%, 15.14%, 13.39%, 15.66%, 31.34%.]
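A frequency distribution like the one on this slide can be sketched by assigning each discount to a band and counting. The band labels follow the slide; the boundary handling (each band includes its lower edge, and exactly 100% falls in the top regular band) is an assumption, and the sample discounts are invented for illustration.

```python
from collections import Counter

# Twelve bands, from negative discounts up to discounts above 100%.
BANDS = (
    ["<0", "0-10%"]
    + [f"{lo}%-{lo + 10}%" for lo in range(10, 100, 10)]
    + [">100%"]
)


def band(discount: float) -> str:
    """Map a discount expressed as a fraction (0.5 means 50%) to its band."""
    if discount < 0:
        return "<0"
    if discount > 1:
        return ">100%"
    # Assumed convention: lower edge inclusive; exactly 100% joins 90%-100%.
    idx = min(int(discount * 10), 9)
    return BANDS[1 + idx]


def frequency_table(discounts):
    """Return the share of observations in every band, empty bands included."""
    counts = Counter(band(d) for d in discounts)
    total = len(discounts)
    if total == 0:
        return {b: 0.0 for b in BANDS}
    return {b: counts.get(b, 0) / total for b in BANDS}


# Hypothetical discounts, not values from the retention dataset.
sample = [-0.25, 0.05, 0.5, 0.95, 1.0, 1.0, 1.2, 0.0]
table = frequency_table(sample)
for label in BANDS:
    print(f"{label:>9}: {table[label]:.2%}")
```

Keeping the "<0" and ">100%" bands separate matters here, because those two bands are where the policy violations and the recording problems discussed on the slides would show up.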
12 Results Analysis (2/8)
Policy-violating bank representatives and negative discounts
[Figure: Distribution of negative discounts. Axes: Discount (horizontal), Count (vertical).]
13 Results Analysis (3/8)
Policy-violating bank representatives and negative discounts
[Figure: Relationships between negative discounts and original and actual fees]
New audit objectives:
- Actual fees are recorded correctly.
- Original fees reflect the number of cards in an account.
14 Results Analysis (4/8)
Effortless bank representatives and inactive representatives
- Bank representatives who always offer 100% discounts should be regarded as not making enough effort to negotiate a lower discount with the clients.
[Figure: Distribution of bank representatives who offered 100% discounts, in the whole retention data and in the 100% discount subset. Axes: Bank representative ID (horizontal), Frequency (vertical). Series: frequency in total; frequency in 100% discount subset.]
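The comparison on this slide (calls per representative overall versus calls in the 100% discount subset) can be sketched as a per-agent share. The minimum call count and the share threshold below are hypothetical parameters, and the agent IDs and discounts are invented; the slides do not state which cutoffs were used.

```python
from collections import defaultdict


def full_discount_share(records, full=1.0):
    """Return {agent_id: (share of calls at or above `full`, number of calls)}."""
    totals = defaultdict(int)
    full_counts = defaultdict(int)
    for agent_id, discount in records:
        totals[agent_id] += 1
        if discount >= full:
            full_counts[agent_id] += 1
    return {a: (full_counts[a] / totals[a], totals[a]) for a in totals}


def flag_agents(records, min_calls=3, min_share=1.0):
    """List agents with enough calls whose full-discount share meets the cutoff.

    min_calls and min_share are illustrative thresholds, not audit policy.
    """
    shares = full_discount_share(records)
    return sorted(
        agent
        for agent, (share, n_calls) in shares.items()
        if n_calls >= min_calls and share >= min_share
    )


# Hypothetical (agent_id, discount) pairs.
records = [
    ("A1", 1.0), ("A1", 1.0), ("A1", 1.0),
    ("A2", 1.0), ("A2", 0.4), ("A2", 0.2),
    ("A3", 1.0),
]
flagged = flag_agents(records)
print(flagged)
```

The minimum call count keeps representatives with only a handful of calls (A3 in the sample) from being flagged on too little evidence; they may instead belong to the inactive group examined separately in the deck.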
15 Results Analysis (5/8)
Effortless bank representatives and inactive representatives
[Table: Descriptive statistics of frequency distribution of bank representatives]
[Figure: Distribution of bank representatives. Active representatives 65%; the remaining 35% comprises inactive representatives (32%) and supervisors (3%).]
16 Results Analysis (6/8)
Effortless bank representatives and inactive representatives
[Figure: Comparison of active and inactive representatives on frequency distribution of discounts. Vertical axis: Percentage, 0% to 35%. Series: frequency of active representatives; frequency of inactive representatives. Labeled values as extracted: 25.97%, 11.28%.]
[Figure: Distribution of bank representatives by service center location. Vertical axis: 0% to 90%. Locations: Sao Paulo, Rio de Janeiro, Salvador, Nao Especificado (not specified). Series: frequency of active representatives; frequency of inactive representatives. Labeled values as extracted: 47.33%, 85.95%, 12.30%, 7.84%, 26.47%, 13.90%, 4.32%, 1.89%.]
17 Results Analysis (7/8)
Non-negotiation bank representatives and short calls
- Discounts offered without negotiation are usually associated with short call durations.
[Table: Descriptive statistics of call duration]
[Figure: Frequency distribution of call duration less than 600 seconds]
18 Results Analysis (8/8)
Non-negotiation bank representatives and short calls
- One possible explanation for these unreasonably short calls is that they were forced to terminate because of a bad network connection.
[Figure: Relationship between call duration and discounts. Axes: Call duration (horizontal), Average discounts (vertical).]
New audit objective: All discounts are given in calls long enough to offer a discount.
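The two views on the short-call slides, average discount by call duration and calls too short to have involved negotiation, can be sketched as follows. The 60-second bucket width and the 30-second cutoff are hypothetical choices, and the sample calls are invented; the deck does not state what duration counts as long enough.

```python
from collections import defaultdict


def average_discount_by_duration(calls, bucket_seconds=60):
    """Return {bucket start in seconds: mean discount of the calls in that bucket}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for duration, discount in calls:
        start = (int(duration) // bucket_seconds) * bucket_seconds
        sums[start] += discount
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


def short_discounted_calls(calls, min_seconds=30):
    """Return calls that granted a discount but ended before min_seconds.

    min_seconds is an illustrative cutoff, not a threshold from the slides.
    """
    return [(dur, disc) for dur, disc in calls if dur < min_seconds and disc > 0]


# Hypothetical (call duration in seconds, discount as a fraction) pairs.
calls = [(5, 1.0), (12, 0.5), (45, 0.0), (75, 0.25), (110, 0.25), (400, 0.5)]
averages = average_discount_by_duration(calls)
suspicious = short_discounted_calls(calls)
print(averages)
print(suspicious)
```

Calls returned by the second function are candidates for follow-up rather than confirmed violations, since the slide itself notes that dropped connections are one possible explanation for very short calls.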
19 Future research directions
- Demonstrate the application of EDA in the audit of business cycles related to the financial statements.
- Demonstrate the application of EDA in other types of auditing.
- Extend the current framework to a continuous auditing environment.
- Explore the application of other EDA technologies in auditing.
- Identify the most suitable EDA techniques for each audit procedure.
20 Thank You!
