Fast, Frugal and Focused:
1. Fast, Frugal and Focused: When Less Information Leads to Better Decisions
Gregory Wheeler, Munich Center for Mathematical Philosophy, Ludwig Maximilians University of Munich
Konstantinos Katsikopoulos, Adaptive Behavior and Cognition Group, Max Planck Institute for Human Development
MCMP Colloquium, June 25, 2014
2-3. The total evidence norm
Naive Bayes, neural networks, rational choice, linear regression, dynamic programming... and bounded rationality?
"...in most situations we might as well throw away our information and toss a coin." (Richard Bellman)
4. Ignoring Information and Better Predictions: 20 studies on economic, educational and psychological predictions
[Figure: accuracy (% correct) of Take The Best, Tallying (1/N), Multiple Regression, and Minimalist, in fitting vs. prediction.] Czerlinski, Gigerenzer, & Goldstein (1999)
5-6. Heuristic structure and strategic biases

Take-the-Best (Gigerenzer & Goldstein 1996)
- Search rule: look up the cue with the highest validity.
- Stopping rule: if the cue values differ (+/-), stop the search; if not, look up the next cue.
- Decision rule: predict that the alternative with the positive cue value has the higher criterion value.
- Bias: ignore cues.

Tallying (1/N) (Dawes 1979)
- Search rule: look up cues in random order.
- Stopping rule: after m (1 < m <= N) cues, stop the search.
- Decision rule: predict that the alternative with the higher number of positive cue values has the higher criterion value.
- Bias: ignore weights.
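The two rule sets above can be sketched in code. This is an illustrative reading of the rules, not the authors' implementation; cue values are coded +1/-1, and Take-the-Best's cue list is assumed to be pre-sorted by validity, highest first:

```python
import random

def take_the_best(cues_a, cues_b):
    """Take-the-Best: cues assumed pre-sorted by decreasing validity;
    the first cue that discriminates decides."""
    for a, b in zip(cues_a, cues_b):
        if a != b:                      # stopping rule: cue values differ
            return "A" if a > b else "B"
    return random.choice(["A", "B"])    # no cue discriminates: guess

def tallying(cues_a, cues_b, m=None):
    """Tallying (1/N): look up the first m cues in any order and count
    positive values; the higher tally wins."""
    m = len(cues_a) if m is None else m
    score_a = sum(1 for a in cues_a[:m] if a > 0)
    score_b = sum(1 for b in cues_b[:m] if b > 0)
    if score_a != score_b:
        return "A" if score_a > score_b else "B"
    return random.choice(["A", "B"])    # tie: guess

print(take_the_best([+1, -1, -1], [-1, +1, +1]))  # -> A (top cue favors A)
print(tallying([+1, -1, -1], [-1, +1, +1]))       # -> B (tally 1 vs. 2)
```

The example at the bottom shows the two biases pulling in opposite directions: Take-the-Best ignores everything after the first discriminating cue, while Tallying ignores the cues' weights and counts.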
7. Outline: a first result (a puzzle); a second result (puzzle solved!); coherentism and heuristics.
8-9. Decision task and setup
Forced-choice paired comparison task: decide which of two alternatives, A and B, has the larger value on some numerical criterion, C, given their values on n cues X_1, ..., X_n.
Perfect Discrimination Assumption: each cue discriminates among the alternatives.
10. Ignoring Information and Better Predictions (Czerlinski, Gigerenzer, & Goldstein 1999): the accuracy figure from slide 4, revisited.
11-14. Accuracy as a function of the size of the training sample

Leave-one-out cross-validation: there are n + 1 inferences to be made in the population.
- Training sample: n
- Test sample: 1
Cross-validation is repeated n + 1 times: each of the n + 1 inferences comprises the test sample once.

Labeling which cues do best:
- v*: maximum cue validity in the n + 1 trials (X* is that cue).
- v': second-maximum cue validity (X' is that cue).

Cue covariation: rho is the covariation between cues X* and X':
rho = Pr(X* and X' are correct on trial t) - Pr(X* is correct on trial t) * Pr(X' is correct on trial t)
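The leave-one-out scheme and the covariation rho can be sketched as follows. The encoding of trials as paired comparisons is an assumption for illustration, not the authors' code; a cue "is correct on a trial" when its larger value goes with the larger criterion value:

```python
def cue_correct(cue_pair, criteria):
    """A cue is correct on a trial when its larger value points at the
    alternative with the larger criterion value."""
    (xa, xb), (ca, cb) = cue_pair, criteria
    return (xa > xb) == (ca > cb)

def validity(train, i):
    """Fraction of training trials on which cue i is correct."""
    return sum(cue_correct(cues[i], crit) for cues, crit in train) / len(train)

def pick_best_cue(train):
    """Index of the maximum-validity cue (v*) on the training sample."""
    return max(range(len(train[0][0])), key=lambda i: validity(train, i))

def leave_one_out_accuracy(trials, choose):
    """Hold out each of the n + 1 trials once; train on the other n,
    then test the chosen cue on the held-out trial."""
    hits = 0
    for t in range(len(trials)):
        train = trials[:t] + trials[t + 1:]
        cues, crit = trials[t]
        hits += cue_correct(cues[choose(train)], crit)
    return hits / len(trials)

def covariation(trials, i, j):
    """rho = Pr(both cues correct) - Pr(i correct) * Pr(j correct)."""
    n = len(trials)
    ci = [cue_correct(cues[i], crit) for cues, crit in trials]
    cj = [cue_correct(cues[j], crit) for cues, crit in trials]
    both = sum(a and b for a, b in zip(ci, cj)) / n
    return both - (sum(ci) / n) * (sum(cj) / n)
```

Each trial here is a pair `(cue_values, criteria)`, where `cue_values[i]` gives cue i's values on the two alternatives; `leave_one_out_accuracy(trials, pick_best_cue)` then estimates the single-cue accuracy that the next slides call alpha.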
15-20. alpha: single-cue predictive accuracy measured by leave-one-out validation

alpha = 1/2 [ v* (1 + v' - 1/(n+1)) + rho ]

- n: size of the training sample; n + 1: size of the population of trials.
- v*: maximum cue validity in the population of n + 1 trials.
- rho: cue covariation.

Assumptions: the Perfect Discrimination Assumption, and either v' = v* - 1/(n+1) or v' = v*.
21. Approximate single-cue predictive accuracy as a function of size of training sample (in 19 data sets)
[Figure: accuracy (% correct) vs. number of objects o in the training sample, with n = o(o+1)/2; for v* = .82 and rho = .01 the theoretical curve is .75 - .41/(n+1). Curves: single-cue predicted accuracy (theory), Take The Best (observed), Naive Bayes (observed).] (Katsikopoulos, Wheeler and Şimşek, 2014, technical report)
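Reading the alpha formula as alpha = 1/2 [v*(1 + v' - 1/(n+1)) + rho] and setting v' = v* = .82 and rho = .01 recovers, up to rounding, the curve .75 - .41/(n+1) printed on the slide. A quick numeric check (the formula reading is an assumption, as is the choice v' = v*):

```python
def alpha(n, v_star=0.82, v2=0.82, rho=0.01):
    """Single-cue predictive accuracy after training on n trials
    (assumed reading of the slide's formula, v2 playing v')."""
    return 0.5 * (v_star * (1 + v2 - 1.0 / (n + 1)) + rho)

def alpha_plot(n):
    """The curve printed on the slide: .75 - .41/(n+1)."""
    return 0.75 - 0.41 / (n + 1)

# o objects in the training sample give n = o(o+1)/2 paired comparisons
for o in (5, 10, 30):
    n = o * (o + 1) // 2
    print(o, round(alpha(n), 4), round(alpha_plot(n), 4))
```

Both curves rise toward their asymptote at the rate 1/(n+1), which is why accuracy climbs so quickly in the number of objects o: the number of paired comparisons n grows quadratically in o.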
22-25. Single-variable decision rules

A Brunswikian question (Egon Brunswik): under what environmental conditions do single-reason rules perform well? High rho? Low rho? Some other structural feature?
- Cues are highly intercorrelated (Hogarth & Karelaia 2005): average pairwise cue correlation rho_{Xi,Xj}.
- Cues are independent (Baucells, Carrasco & Hogarth 2008).
- Cues are conditionally independent (Katsikopoulos & Martignon 2006).
26-29. Central idea of focused correlation

Compare Cov[X_1, ..., X_n | C = c] with Cov[X_1, ..., X_n], and let all random variables be indicator functions. Then

For_c(x_1, ..., x_n) := [ Pr(x_1, ..., x_n | c) / (Pr(x_1 | c) ... Pr(x_n | c)) ] / [ Pr(x_1, ..., x_n) / (Pr(x_1) ... Pr(x_n)) ]
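For small discrete distributions, For_c can be computed directly from the definition above. A minimal sketch for two binary cues, using a hypothetical common-cause joint distribution (a fair coin C, with each cue matching C with probability 0.9, independently given C):

```python
from itertools import product

def marg(p, **fixed):
    """Marginal probability: sum p over outcomes matching the fixed coords."""
    idx = {"c": 0, "x1": 1, "x2": 2}
    return sum(pr for outcome, pr in p.items()
               if all(outcome[idx[k]] == v for k, v in fixed.items()))

def focused_correlation(p, c, x1, x2):
    """For_c(x1, x2): association of the cues given C = c, divided by
    their unconditional association."""
    pc = marg(p, c=c)
    cond = (marg(p, c=c, x1=x1, x2=x2) / pc) / (
        (marg(p, c=c, x1=x1) / pc) * (marg(p, c=c, x2=x2) / pc))
    uncond = marg(p, x1=x1, x2=x2) / (marg(p, x1=x1) * marg(p, x2=x2))
    return cond / uncond

# Hypothetical common-cause distribution: p[(c, x1, x2)] with the cues
# conditionally independent given C but marginally dependent.
p = {(c, x1, x2): 0.5 * (0.9 if x1 == c else 0.1) * (0.9 if x2 == c else 0.1)
     for c, x1, x2 in product((0, 1), repeat=3)}

print(focused_correlation(p, 1, 1, 1))  # well below 1: cues associated only via C
```

Because the cues are conditionally independent given C, the conditional association (the numerator) is 1, while the unconditional association exceeds 1, so For_c comes out below 1; this is the deflationary case the later slides connect to single-cue robustness.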
30-31. Single-cue accuracy as a function of criterion predictability and focused correlation

v_1 = Sum over c, x_2, ..., x_k of [ Pr(C = c | X_1 = c, X_2 = x_2, ..., X_k = x_k) / For_C(X_1 = c, X_2 = x_2, ..., X_k = x_k) ] * Pr(X_1 = c) Pr(X_2 = x_2) ... Pr(X_k = x_k)

where the numerator Pr(C = c | X_1 = c, ...) is the criterion predictability.

Result: single-cue accuracy increases when the ratio of criterion predictability to focused cue correlation increases.
32-34. Solving the puzzle

- Cues should be dependent, but conditionally independent given the criterion: X_1 not-independent X_2, and X_1 independent X_2 given C (common-cause structure: C -> X_1, C -> X_2).
- Cues should be independent, but conditionally dependent given the criterion: X_1 independent X_2, and X_1 not-independent X_2 given C (common-effect structure: X_1 -> C <- X_2).

Result 2:

v_1 = Sum over c, x_2, ..., x_k of [ Pr(C = c | X_1 = c, X_2 = x_2, ..., X_k = x_k) / For_C(X_1 = c, ..., X_k = x_k) ] * Pr(X_1 = c) Pr(X_2 = x_2) ... Pr(X_k = x_k)

where For_C is the ratio of the conditional association Pr(x_1, ..., x_n | c) / (Pr(x_1 | c) ... Pr(x_n | c)) (the numerator N) to the unconditional association Pr(x_1, ..., x_n) / (Pr(x_1) ... Pr(x_n)) (the denominator D).
35. Resolving a discontinuity
[Figure: three graph structures over C, X_1, X_2, with conditions of the form P(X_2) = P(X_2'), P(C | X_1) = P(C | X_2), and P(C | X_1) = P(C | X_2') (assumption A2).]
36. Adaptive epistemic norms

- Conditional independence: lousy for total evidence, good for a single cue.
- Robustness of a single cue: conditionally independent cues give deflationary focused correlation; independent cues give inflationary focused correlation.
- Total evidence coherence: inflationary focused correlation.

Final remarks:
- Coherentism and heuristics are complementary.
- Adaptive epistemology.
37. Key references
- Baucells, M., Carrasco, J. A., and Hogarth, R. (2008). Cumulative Dominance and Heuristic Performance in Binary Multi-attribute Choice. Operations Research, 56.
- Bovens, L. and Hartmann, S. (2003). Bayesian Epistemology. Oxford University Press.
- Olsson, E. (2005). Against Coherence. Oxford University Press.
- Hogarth, R. and Karelaia, N. (2005). Ignoring Information in Binary Choice with Continuous Variables: When Is Less More? Journal of Mathematical Psychology, 49.
- Katsikopoulos, K. and Martignon, L. (2006). Naïve Heuristics for Paired Comparison: Some Results on Their Relative Accuracy. Journal of Mathematical Psychology, 50.
- Katsikopoulos, K., Schooler, L., and Hertwig, R. (2010). The Robust Beauty of Ordinary Information. Psychological Review, 117(4).
- Schlosshauer, M. and Wheeler, G. (2011). Focused Correlation and the Jigsaw Puzzle of Variable Evidence. Philosophy of Science, 78(3).
- Wheeler, G. and Scheines, R. (2013). Coherence and Confirmation Through Causation. Mind, 122(435).
- Wheeler, G. (2009). Focused Correlation and Confirmation. The British Journal for the Philosophy of Science, 60(1).
- Wheeler, G. (2012). Explaining the Limits of Olsson's Impossibility Result. The Southern Journal of Philosophy, 50(1).