ROOT: A data mining tool from CERN What can actuaries do with it?
|
|
|
- Cory Young
- 9 years ago
- Views:
Transcription
1 ROOT: A data mining tool from CERN What can actuaries do with it? Ravi Kumar, Senior Manager, Deloitte Consulting LLP Lucas Lau, Senior Consultant, Deloitte Consulting LLP Southern California Casualty Actuarial Club Fall Meeting Pasadena, California December 3, Background P&C insurance companies are increasingly becoming Data Driven organizations To understand their customers better To improve operational efficiencies To stay profitable To stay ahead of their competitors So, data volumes are growing at an annual rate of 60% or even more. Actuaries are finding it increasingly difficult to manage this data explosion that is happening around them, a data mining tool from CERN, offers a scalable solution that can help actuaries deal with large volumes of data 2 3 Demand for internal data Actuaries are using more detailed internal data in their day to day work Demand for external data Actuaries are constantly looking for new business insights that can be derived from external data Billing Claims Policy & Premium Pharmacy Lexis Nexis House hold information Claimant History Customer Name & Address Financial Census Broker & Vendor Notes (Text Mining) Medical Social Network Data Analysis Analysis Business Insights Business Insights
2 Data, data everywhere Conclusion 6 7 What is ROOT? Root as a Data Storage Framework Root is an efficient Data Storage framework specifically designed to handle Big Data Root stores data in a compressed columnar structure Root is a Data Visualization Tool with extensive data visualization capabilities Root can store complex objects, structures, arrays, images, in addition to simple data types Design is optimized for fast analysis capabilities Root is a Modeling Tool with extensive set of data fitting, modeling and analysis methods including data simulation For example: Root is a complete C++ programming environment Root comes with full memory management, networking, parallel processing capabilities for increased scalability 8 In this example, Root takes only 18% space compared to the corresponding ASCII file 9 Root as a Data Visualization Tool... Root as a Modeling Tool Root provides convenient graphic tools to visualize and explore even very large datasets Multiple Regression Maximum Likelihood Fitting General purpose function minimizers Can fit any user defined function to the data Neural Networks Decision and Regression Trees Boosting and Bagging Principal Component Analysis Rules based predictive learning Support vector machines Several other Classification/Regression Algorithms Histograms 1-, 2- and 3-Dimensional With interactive capabilities With Dynamic re-binning capabilities Curve Fitting Graphs Pie Charts QQ Plots Parallel Coordinates 10 11
3 Bundled Services Price Price of Single Service Optimal Volume Originated ($B) A B Current More about Root Root is available as a free open source software from Root is supported on many operating systems like Windows, Unix, Linux, Mac OS etc. Root is used in many industries High Energy Physics community Wall Street trading companies(merrill Lynch, Renaissance Corp) Flight planning systems (MITRE) Insurance(Nationwide, Deloitte) Defense(USAF, DoD) Oil, geology(haliburton) Medical Imaging(Philips Medical) What are you trying to optimize? Price Optimization is the process of calculating optimal set of prices, taking into consideration demand elasticity, customer preferences and cost of delivering products or services. 1. Define the Objective Function Profit = Quantity * (Price Cost) 2. Determine the Constraints Floor/ceiling prices Inter-product relationship rules (equal, greater than) based on attributes 3. Decide on strategy, trade-offs Revenue versus profit versus market share versus 4. Use Calculus methods (Lagrange multipliers, Linear Programming, etc) to solve for price for multiple strategic scenarios Solve for Prices that maximize Objective subject to Constraints Putting Optimization Into Practice How do we optimize prices? 1. Profitability function formularization : Most institutions can accurately measure performance from an operational standpoint. However, they cannot accurately estimate the profitability of an individual transaction at the time of origination. This requires robust pro forma models that estimate expected income, costs, and risks over the lifetime of a transaction. Accurate transaction-level estimation of profitability is a pre-requisite for a successful pricing initiative. When implementing optimization, a single measure must be selected at a given point in time. Profit Profit Price Volume Variable Fixed = ( x ) - ( Cost + Cost ) 2. Incorporation of constraints to the benefit function: Once benefit function is estimated, restrictions are incorporated (based on internal policies, standard requirements, etc.) that may condition pricing policy. Max Price Increases/Decreases Regulatory Restrictions Supply Constraints. Pricing/Product Rules Translating this situation into optimization problem formulation, we see that the objective is to find the highest point on the hill. Therefore, objective function is the height achieved by the first boy with respect to his original position. The design variables are longitude and latitude the coordinates, defining position of the boy. The constraints are that the boy has to stay inside the fences. Note here, that in general, the boy may start the search from outside the fences. 3. Benefit optimization: The constrained objective function (e.g. profit, revenue, attrition rate, etc.) is optimized through simulation of multiple scenarios, generating efficient frontiers for a segment, market, product, or a combination thereof. Demand Models Benefits Function Efficient Frontier Revenue / Customer Benefit= f (P, Q, X) + Constraints Win Probability Annual Pre-Tax Interest Income (in $MM per $1B originated) 16 17
4 Data Exploratory in Price Elasticity Let s look at the traditional way to do it in SAS: proc univariate. Definition of Elasticity: the ratio of percentage change in quantity sold to percentage change in price Elasticity = QSold QSold P P In order to have an accurate estimate of elasticity, we need to examine the outliers of both numerator and denominator How do they distribute graphically? Here is the graphical representation of distribution After the log transformation, the frequency of extreme values pops X-Axis: % change in Quantity Sold Y-Axis: Frequency X-Axis: % change in price Y-Axis: Frequency X-Axis: % change in Quantity Sold Y-Axis: Frequency (log scale) X-Axis: % change in price Y-Axis: Frequency(log scale) 20 The absolute count on Y-axis does not tell us about the frequency of extreme cases because they are too small relative to the majority. We need to do some transformation to Y-axis. SAS? or ROOT? 21 Just a right click on your mouse, the Y-axis has been transformed to log scale. This save the hassle of programming the log transformation and running the program in SAS! Modeling in Root Use TMinuitMinimizer to minimize the negative of the objective function in ROOT Root is a complete C++ programming environment, so that you can customize your objective function and make your optimization engine as proprietary as possible Root is an efficient tool in running optimization. We compared our internal optimization engines between R and Root. Root outperformed R significantly in terms of running time
5 Conclusion Root is specifically designed to handle, store, and analyze large amounts of data very efficiently Root comes with many Statistical tools Data mining tools Data visualization tools Connectivity tools Bibliography Main ROOT Website Paper on CAS E-Forum introducing Root to the actuaries Toolkit for Multivariate data Analysis with ROOT Root is suitable for any environment requiring serious and scalable data analysis solution 24 25
MSCA 31000 Introduction to Statistical Concepts
MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression
The power of IBM SPSS Statistics and R together
IBM Software Business Analytics SPSS Statistics The power of IBM SPSS Statistics and R together 2 Business Analytics Contents 2 Executive summary 2 Why integrate SPSS Statistics and R? 4 Integrating R
Predict the Popularity of YouTube Videos Using Early View Data
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
Advanced analytics at your hands
2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously
Optimization: Optimal Pricing with Elasticity
Optimization: Optimal Pricing with Elasticity Short Examples Series using Risk Simulator For more information please visit: www.realoptionsvaluation.com or contact us at: [email protected]
ANALYTICS CENTER LEARNING PROGRAM
Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals
Lecture 2: Descriptive Statistics and Exploratory Data Analysis
Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals
Exploratory Spatial Data Analysis
Exploratory Spatial Data Analysis Part II Dynamically Linked Views 1 Contents Introduction: why to use non-cartographic data displays Display linking by object highlighting Dynamic Query Object classification
Studying Auto Insurance Data
Studying Auto Insurance Data Ashutosh Nandeshwar February 23, 2010 1 Introduction To study auto insurance data using traditional and non-traditional tools, I downloaded a well-studied data from http://www.statsci.org/data/general/motorins.
MSCA 31000 Introduction to Statistical Concepts
MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
Customer and Business Analytic
Customer and Business Analytic Applied Data Mining for Business Decision Making Using R Daniel S. Putler Robert E. Krider CRC Press Taylor &. Francis Group Boca Raton London New York CRC Press is an imprint
COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments
Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for
Descriptive Statistics
Descriptive Statistics Descriptive statistics consist of methods for organizing and summarizing data. It includes the construction of graphs, charts and tables, as well various descriptive measures such
Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar
Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Prepared by Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc. www.data-mines.com [email protected]
Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary
Shape, Space, and Measurement- Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two- and three-dimensional shapes by demonstrating an understanding of:
Notes for EER #4 Graph transformations (vertical & horizontal shifts, vertical stretching & compression, and reflections) of basic functions.
Notes for EER #4 Graph transformations (vertical & horizontal shifts, vertical stretching & compression, and reflections) of basic functions. Basic Functions In several sections you will be applying shifts
The Predictive Data Mining Revolution in Scorecards:
January 13, 2013 StatSoft White Paper The Predictive Data Mining Revolution in Scorecards: Accurate Risk Scoring via Ensemble Models Summary Predictive modeling methods, based on machine learning algorithms
What Does the Normal Distribution Sound Like?
What Does the Normal Distribution Sound Like? Ananda Jayawardhana Pittsburg State University [email protected] Published: June 2013 Overview of Lesson In this activity, students conduct an investigation
Lecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization
Lecture 2. Marginal Functions, Average Functions, Elasticity, the Marginal Principle, and Constrained Optimization 2.1. Introduction Suppose that an economic relationship can be described by a real-valued
not possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
Temperature Scales. The metric system that we are now using includes a unit that is specific for the representation of measured temperatures.
Temperature Scales INTRODUCTION The metric system that we are now using includes a unit that is specific for the representation of measured temperatures. The unit of temperature in the metric system is
The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175)
Describing Data: Categorical and Quantitative Variables Population The Big Picture Sampling Statistical Inference Sample Exploratory Data Analysis Descriptive Statistics In order to make sense of data,
5 Systems of Equations
Systems of Equations Concepts: Solutions to Systems of Equations-Graphically and Algebraically Solving Systems - Substitution Method Solving Systems - Elimination Method Using -Dimensional Graphs to Approximate
Anti-Trust Notice. Agenda. Three-Level Pricing Architect. Personal Lines Pricing. Commercial Lines Pricing. Conclusions Q&A
Achieving Optimal Insurance Pricing through Class Plan Rating and Underwriting Driven Pricing 2011 CAS Spring Annual Meeting Palm Beach, Florida by Beth Sweeney, FCAS, MAAA American Family Insurance Group
KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES
HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES Translating data into business value requires the right data mining and modeling techniques which uncover important patterns within
EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d.
EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER ANALYTICS LIFECYCLE Evaluate & Monitor Model Formulate Problem Data Preparation Deploy Model Data Exploration Validate Models
Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important
Health Spring Meeting May 2008 Session # 42: Dental Insurance What's New, What's Important Floyd Ray Martin, FSA, MAAA Thomas A. McInteer, FSA, MAAA Jonathan P. Polon, FSA Dental Insurance Fraud Detection
IBM SPSS Direct Marketing
IBM Software IBM SPSS Statistics 19 IBM SPSS Direct Marketing Understand your customers and improve marketing campaigns Highlights With IBM SPSS Direct Marketing, you can: Understand your customers in
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
Unit 9 Describing Relationships in Scatter Plots and Line Graphs
Unit 9 Describing Relationships in Scatter Plots and Line Graphs Objectives: To construct and interpret a scatter plot or line graph for two quantitative variables To recognize linear relationships, non-linear
The Edge Editions of SAP InfiniteInsight Overview
Analytics Solutions from SAP The Edge Editions of SAP InfiniteInsight Overview Enabling Predictive Insights with Mouse Clicks, Not Computer Code Table of Contents 3 The Case for Predictive Analysis 5 Fast
Vectors. Objectives. Assessment. Assessment. Equations. Physics terms 5/15/14. State the definition and give examples of vector and scalar variables.
Vectors Objectives State the definition and give examples of vector and scalar variables. Analyze and describe position and movement in two dimensions using graphs and Cartesian coordinates. Organize and
ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS
DATABASE MARKETING Fall 2015, max 24 credits Dead line 15.10. ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS PART A Gains chart with excel Prepare a gains chart from the data in \\work\courses\e\27\e20100\ass4b.xls.
WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat
Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise
Fairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
Chapter 4 Online Appendix: The Mathematics of Utility Functions
Chapter 4 Online Appendix: The Mathematics of Utility Functions We saw in the text that utility functions and indifference curves are different ways to represent a consumer s preferences. Calculus can
Linear Programming. Solving LP Models Using MS Excel, 18
SUPPLEMENT TO CHAPTER SIX Linear Programming SUPPLEMENT OUTLINE Introduction, 2 Linear Programming Models, 2 Model Formulation, 4 Graphical Linear Programming, 5 Outline of Graphical Procedure, 5 Plotting
Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes
Knowledge Discovery and Data Mining Lecture 19 - Bagging Tom Kelsey School of Computer Science University of St Andrews http://tom.host.cs.st-andrews.ac.uk [email protected] Tom Kelsey ID5059-19-B &
II. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
Principles of Dat Da a t Mining Pham Tho Hoan [email protected] [email protected]. n
Principles of Data Mining Pham Tho Hoan [email protected] References [1] David Hand, Heikki Mannila and Padhraic Smyth, Principles of Data Mining, MIT press, 2002 [2] Jiawei Han and Micheline Kamber,
Microeconomics Sept. 16, 2010 NOTES ON CALCULUS AND UTILITY FUNCTIONS
DUSP 11.203 Frank Levy Microeconomics Sept. 16, 2010 NOTES ON CALCULUS AND UTILITY FUNCTIONS These notes have three purposes: 1) To explain why some simple calculus formulae are useful in understanding
A Property and Casualty Insurance Predictive Modeling Process in SAS
Paper 11422-2016 A Property and Casualty Insurance Predictive Modeling Process in SAS Mei Najim, Sedgwick Claim Management Services ABSTRACT Predictive analytics is an area that has been developing rapidly
Lesson 4 What Is a Plant s Life Cycle? The Seasons of a Tree
Lesson 4 What Is a Plant s Life Cycle? The Seasons of a Tree STUDENT SKILLS: predicting, communicating prior observations and knowledge, listening, cooperating, observing, sequencing, communicating, reasoning,
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht [email protected] 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht [email protected] 539 Sennott
Diagrams and Graphs of Statistical Data
Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in
APPLICATION PROGRAMMING: DATA MINING AND DATA WAREHOUSING
Wrocław University of Technology Internet Engineering Henryk Maciejewski APPLICATION PROGRAMMING: DATA MINING AND DATA WAREHOUSING PRACTICAL GUIDE Wrocław (2011) 1 Copyright by Wrocław University of Technology
Data Mining: An Introduction
Data Mining: An Introduction Michael J. A. Berry and Gordon A. Linoff. Data Mining Techniques for Marketing, Sales and Customer Support, 2nd Edition, 2004 Data mining What promotions should be targeted
CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19
PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations
Price Optimization. For New Business Profit and Growth
Property & Casualty Insurance Price Optimization For New Business Profit and Growth By Angel Marin and Thomas Bayley Non-life companies want to maximize profits from new business. Price optimization can
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through
Data Exploration Data Visualization
Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select
7 Time series analysis
7 Time series analysis In Chapters 16, 17, 33 36 in Zuur, Ieno and Smith (2007), various time series techniques are discussed. Applying these methods in Brodgar is straightforward, and most choices are
Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs
Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)
Visualizing Data. Contents. 1 Visualizing Data. Anthony Tanbakuchi Department of Mathematics Pima Community College. Introductory Statistics Lectures
Introductory Statistics Lectures Visualizing Data Descriptive Statistics I Department of Mathematics Pima Community College Redistribution of this material is prohibited without written permission of the
STATISTICA Solutions for Financial Risk Management Management and Validated Compliance Solutions for the Banking Industry (Basel II)
STATISTICA Solutions for Financial Risk Management Management and Validated Compliance Solutions for the Banking Industry (Basel II) With the New Basel Capital Accord of 2001 (BASEL II) the banking industry
TEACHING AGGREGATE PLANNING IN AN OPERATIONS MANAGEMENT COURSE
TEACHING AGGREGATE PLANNING IN AN OPERATIONS MANAGEMENT COURSE Johnny C. Ho, Turner College of Business, Columbus State University, Columbus, GA 31907 David Ang, School of Business, Auburn University Montgomery,
Modeling Lifetime Value in the Insurance Industry
Modeling Lifetime Value in the Insurance Industry C. Olivia Parr Rud, Executive Vice President, Data Square, LLC ABSTRACT Acquisition modeling for direct mail insurance has the unique challenge of targeting
Data Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2
Principles of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I
BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential
Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data
Using Excel (Microsoft Office 2007 Version) for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable
Sanjeev Kumar. contribute
RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 [email protected] 1. Introduction The field of data mining and knowledgee discovery is emerging as a
Professional Organization Checklist for the Computer Science Curriculum Updates. Association of Computing Machinery Computing Curricula 2008
Professional Organization Checklist for the Computer Science Curriculum Updates Association of Computing Machinery Computing Curricula 2008 The curriculum guidelines can be found in Appendix C of the report
Managerial Economics Prof. Trupti Mishra S.J.M. School of Management Indian Institute of Technology, Bombay. Lecture - 13 Consumer Behaviour (Contd )
(Refer Slide Time: 00:28) Managerial Economics Prof. Trupti Mishra S.J.M. School of Management Indian Institute of Technology, Bombay Lecture - 13 Consumer Behaviour (Contd ) We will continue our discussion
Data, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
Data Mining Analytics for Business Intelligence and Decision Support
Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing
Business Intelligence. Data Mining and Optimization for Decision Making
Brochure More information from http://www.researchandmarkets.com/reports/2325743/ Business Intelligence. Data Mining and Optimization for Decision Making Description: Business intelligence is a broad category
Data Visualization Techniques and Practices Introduction to GIS Technology
Data Visualization Techniques and Practices Introduction to GIS Technology Michael Greene Advanced Analytics & Modeling, Deloitte Consulting LLP March 16 th, 2010 Antitrust Notice The Casualty Actuarial
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar What is data exploration? A preliminary exploration of the data to better understand its characteristics.
EdExcel Decision Mathematics 1
EdExcel Decision Mathematics 1 Linear Programming Section 1: Formulating and solving graphically Notes and Examples These notes contain subsections on: Formulating LP problems Solving LP problems Minimisation
MARS STUDENT IMAGING PROJECT
MARS STUDENT IMAGING PROJECT Data Analysis Practice Guide Mars Education Program Arizona State University Data Analysis Practice Guide This set of activities is designed to help you organize data you collect
Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.
Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of
Part 1: Background - Graphing
Department of Physics and Geology Graphing Astronomy 1401 Equipment Needed Qty Computer with Data Studio Software 1 1.1 Graphing Part 1: Background - Graphing In science it is very important to find and
Lesson 4: Solving and Graphing Linear Equations
Lesson 4: Solving and Graphing Linear Equations Selected Content Standards Benchmarks Addressed: A-2-M Modeling and developing methods for solving equations and inequalities (e.g., using charts, graphs,
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Questions: Does it always take the same amount of force to lift a load? Where should you press to lift a load with the least amount of force?
Lifting A Load 1 NAME LIFTING A LOAD Questions: Does it always take the same amount of force to lift a load? Where should you press to lift a load with the least amount of force? Background Information:
Support Vector Machines Explained
March 1, 2009 Support Vector Machines Explained Tristan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introduction This document has been written in an attempt to make the Support Vector Machines (SVM),
Personal Auto Predictive Modeling Update: What s Next? Roosevelt Mosley, FCAS, MAAA CAS Predictive Modeling Seminar October 6 7, 2008 San Diego, CA
Personal Auto Predictive Modeling Update: What s Next? Roosevelt Mosley, FCAS, MAAA CAS Predictive Modeling Seminar October 6 7, 2008 San Diego, CA You ve Heard Where predictive modeling for auto has been
Didacticiel Études de cas
1 Theme Data Mining with R The rattle package. R (http://www.r project.org/) is one of the most exciting free data mining software projects of these last years. Its popularity is completely justified (see
Session 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
Decision Trees from large Databases: SLIQ
Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 unpractical! SLIQ: Sort the values
GRAPH MATCHING EQUIPMENT/MATERIALS
GRAPH MATCHING LAB MECH 6.COMP. From Physics with Computers, Vernier Software & Technology, 2000. Mathematics Teacher, September, 1994. INTRODUCTION One of the most effective methods of describing motion
Optimization Modeling for Mining Engineers
Optimization Modeling for Mining Engineers Alexandra M. Newman Division of Economics and Business Slide 1 Colorado School of Mines Seminar Outline Linear Programming Integer Linear Programming Slide 2
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7
DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY
STATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
SAS Certificate Applied Statistics and SAS Programming
SAS Certificate Applied Statistics and SAS Programming SAS Certificate Applied Statistics and Advanced SAS Programming Brigham Young University Department of Statistics offers an Applied Statistics and
Charlesworth School Year Group Maths Targets
Charlesworth School Year Group Maths Targets Year One Maths Target Sheet Key Statement KS1 Maths Targets (Expected) These skills must be secure to move beyond expected. I can compare, describe and solve
