Incorporating cost in Bayesian Variable Selection, with application to cost-effective measurement of quality of health care.


University of Florida 10th Annual Winter Workshop: Bayesian Model Selection and Objective Methods

Dimitris Fouskakis, Department of Mathematics, School of Applied Mathematical and Physical Sciences, National Technical University of Athens, Athens, Greece.

Joint work with:
Ioannis Ntzoufras, Department of Statistics, Athens University of Economics and Business, Athens, Greece;
David Draper, Department of Applied Mathematics and Statistics, University of California, Santa Cruz, USA.

Presentation is available at: fouskakis/conferences/bms/bms.pdf.

Synopsis

1. Motivation: Indirect Measurement of Quality of Health Care.
2. Model Specification.
3. Cost-Benefit Analysis.
4. Cost-Restriction-Benefit Analysis.
5. Discussion.

1 Motivation: Indirect Measurement of Quality of Health Care

How to measure hospital quality of care?

Indirect method (input-output approach): hospital outcomes (e.g., mortality within 30 days of admission) are compared after adjusting for differences in inputs (sickness at admission).

Patient sickness at admission is traditionally assessed by a logistic regression of mortality within 30 days of admission on a fairly large number of sickness indicators (on the order of 100), which is used to construct a sickness scale.

Benefit-Only Analysis: classical variable selection techniques can be employed to find an optimal subset of indicators. In a major U.S. study conducted by the RAND Corporation, such an approach was used to reduce the initial list of p = 83 sickness indicators, gathered on n = 2,532 pneumonia patients, down to a core of 14 predictors (Keeler et al., 1990).
The 14-Variable RAND Pneumonia Scale

The RAND admission sickness scale for pneumonia (p = 14 variables), with the marginal data collection cost per patient for each variable (in minutes of abstraction time).

Variable                                                      Cost (minutes)
Blood Urea Nitrogen                                           1.50
Age                                                           0.50
Systolic Blood Pressure Score (2-point scale)                 0.50
Chest X-ray Congestive Heart Failure Score (3-point scale)    2.50
Total APACHE II Score (36-point scale)
APACHE II Coma Score (3-point scale)                          2.50
Serum Albumin (3-point scale)                                 1.50
Shortness of Breath Day 1                                     1.00
Respiratory Distress                                          1.00
Septic Complications                                          3.00
Prior Respiratory Failure                                     2.00
Recently Hospitalized                                         2.00
Ambulatory Score (3-point scale)                              2.50
Initial Temperature                                           0.50
2 Model Specification

Logistic regression model with $Y_i = 1$ if patient $i$ dies within 30 days of admission.
$X_{ij}$: sickness predictor variable $j$ for patient $i$.
$\gamma = (\gamma_1, \ldots, \gamma_p)^T$, where $\gamma_j$ is a binary indicator of the inclusion of variable $X_j$ in the model.
Model space $\mathcal{M} = \{0, 1\}^p$; $p$ = total number of variables considered.

Hence the model formulation can be summarized as
\[
(Y_i \mid \gamma) \overset{\text{indep}}{\sim} \text{Bernoulli}\big(p_i(\gamma)\big),
\]
\[
\eta_i(\gamma) = \log\left(\frac{p_i(\gamma)}{1 - p_i(\gamma)}\right) = \sum_{j=0}^{p} \beta_j \gamma_j X_{ij},
\]
\[
\eta(\gamma) = X\,\text{diag}(\gamma)\,\beta = X_\gamma \beta_\gamma .
\]

Two different approaches

The RAND Benefit-Only approach is suboptimal: it does not consider differences in the cost of data collection among the available predictors.

We propose a Cost-Benefit Analysis, in which variables are chosen only when they predict well enough given how much they cost to collect.

In problems such as this, in which two desirable criteria compete and a joint optimization must be achieved, there are two main ways to proceed:
(a) Both criteria can be placed on a common scale, and optimization can occur on that scale.
(b) One criterion can be optimized, subject to a bound on the other.

Three methods for solving this problem

(1) (strategy (a)) Draper and Fouskakis (2000) and Fouskakis and Draper (2002, 2008) proposed an approach to this problem based on Bayesian Decision Theory. They used stochastic optimization methods to find (near-)optimal subsets of predictor variables that maximize an expected utility function which trades off data collection cost against predictive accuracy.

(2) (strategy (a)) In this work, as an alternative to (1), we propose a prior distribution that accounts for the cost of each variable and results in a set of posterior model probabilities corresponding to a Generalized Cost-Adjusted version of the Bayesian Information Criterion (Fouskakis, Ntzoufras and Draper, 2007a).

(3) (strategy (b)) We also implement a Cost-Restriction-Benefit Analysis, where the search is conducted only among models whose cost does not exceed a budgetary restriction (Fouskakis, Ntzoufras and Draper, 2007b), using a Population-Based Trans-Dimensional RJMCMC method.

Here we present results from methods (2) (Cost-Benefit Analysis) and (3) (Cost-Restriction-Benefit Analysis).

3 Cost-Benefit Analysis

The aim is to identify well-fitting models after taking into account the cost of each variable. Therefore we need to estimate the posterior model probability
\[
f(\gamma \mid y) = \frac{f(\gamma) \int f(y \mid \beta_\gamma, \gamma)\, f(\beta_\gamma \mid \gamma)\, d\beta_\gamma}{\sum_{\gamma' \in \{0,1\}^p} f(\gamma') \int f(y \mid \beta_{\gamma'}, \gamma')\, f(\beta_{\gamma'} \mid \gamma')\, d\beta_{\gamma'}}
\]
after introducing a prior on model space $f(\gamma)$ that depends on the cost.

Prior on Model Parameters
\[
f(\beta_\gamma \mid \gamma) = \text{Normal}\left(0,\; 4n \left(X_\gamma^T X_\gamma\right)^{-1}\right)
\]
This is a low-information prior, since it gives the prior a weight equal to one data point (see Ntzoufras, Dellaportas and Forster, 2003).
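As a rough illustration of the model specification and the low-information prior above, the following Python sketch evaluates the Bernoulli log-likelihood for a given inclusion vector and builds the prior covariance $4n (X_\gamma^T X_\gamma)^{-1}$. The function names, the toy data and the omission of an intercept column are assumptions made for brevity; none of them come from the slides.

```python
import numpy as np

def log_likelihood(y, X, gamma, beta):
    """Bernoulli log-likelihood with linear predictor eta = X diag(gamma) beta."""
    eta = X @ (gamma * beta)
    # sum_i [ y_i * eta_i - log(1 + exp(eta_i)) ], computed in a stable way
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

def prior_covariance(X, gamma):
    """Covariance 4n (X_gamma^T X_gamma)^{-1} of the zero-mean normal prior."""
    X_gamma = X[:, gamma.astype(bool)]  # keep only the included columns
    n = X_gamma.shape[0]
    return 4.0 * n * np.linalg.inv(X_gamma.T @ X_gamma)

# Hypothetical toy data: n = 50 patients, p = 4 candidate predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = rng.integers(0, 2, size=50).astype(float)
gamma = np.array([1, 0, 1, 0])           # variables 1 and 3 are included
beta = np.array([0.3, -1.0, 0.5, 2.0])   # entries with gamma_j = 0 are ignored

print(log_likelihood(y, X, gamma, beta))
print(prior_covariance(X, gamma).shape)
```

Because excluded coefficients are multiplied by zero, changing them leaves the likelihood unchanged, which is what makes $\gamma$ act as a model indicator.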
A Cost-penalized Prior on Model Space

\[
f(\gamma_j) \propto \exp\left(-\frac{\gamma_j}{2}\, \frac{c_j - c_0}{c_0}\, \log n\right) \quad \text{for } j = 1, \ldots, p.
\]

$c_j$: cost per observation for variable $X_j$.
$c_0$: baseline cost (default choice: $c_0 = \min\{c_j\}$, $j = 1, \ldots, p$).

When comparing models $\gamma^{(k)}$ and $\gamma^{(l)}$, the penalty imposed on the log-likelihood ratio is given by
\[
-2 \log \frac{f(\gamma^{(k)})}{f(\gamma^{(l)})} = \sum_{j=1}^{p} \left(\gamma_j^{(k)} - \gamma_j^{(l)}\right) \frac{c_j}{c_0} \log n - \left(d_{\gamma^{(k)}} - d_{\gamma^{(l)}}\right) \log n .
\]

Indifference concerning the cost ($c_j = c_0$ for $j = 1, \ldots, p$) gives a uniform prior on model space ($f(\gamma) \propto 1$), so that posterior model odds = Bayes factor.

Approximations of the Posterior Model Odds

Using the Laplace approximation in our model formulation we end up with
\[
-2 \log f(\gamma \mid y) = -2 \log f(y \mid \tilde\beta_\gamma, \gamma) + \varphi(\gamma) - 2 \log f(\gamma) + O(n^{-1}),
\]
where $f(\gamma)$ is the prior model probability and $-2 \log f(\gamma)$ acts as the penalty term, with
\[
\varphi(\gamma) = \frac{1}{4n}\, \tilde\beta_\gamma^T X_\gamma^T X_\gamma \tilde\beta_\gamma + d_\gamma \log(4n) + \log \frac{\left|\Psi_\gamma^{-1}\right|}{\left|X_\gamma^T X_\gamma\right|} .
\]
The first term of $\varphi(\gamma)$ can be thought of as a measure of discrepancy between the data and the prior information on the model parameters.

$\tilde\beta_\gamma$: posterior mode of $f(\beta_\gamma \mid y, \gamma)$.
$d_\gamma = \sum_{j=1}^{p} \gamma_j$ is the dimension of model $\gamma$.
$\Psi_\gamma$ is minus the inverse of the Hessian matrix of $h(\beta_\gamma) = \log f(y \mid \beta_\gamma, \gamma) + \log f(\beta_\gamma \mid \gamma)$, evaluated at the posterior mode $\tilde\beta_\gamma$.

Penalty Interpretation: A generalized cost-adjusted BIC

\[
-2 \log f(\gamma \mid y) = -2 \log f(y \mid \hat\beta_\gamma) + \frac{C_\gamma}{c_0} \log n + O(1) = -2 \log f(y \mid \hat\beta_\gamma) + \sum_{j=1}^{p} \gamma_j \frac{c_j}{c_0} \log n + O(1).
\]

$C_\gamma = \sum_{j=1}^{p} \gamma_j c_j$: the cost of model $\gamma$.
$\hat\beta_\gamma$: MLE of the parameters $\beta_\gamma$ of model $\gamma$.
If $c_j = c_0$ for all $j$, the criterion reduces to $\text{BIC} = -2 \log f(y \mid \hat\beta_\gamma) + d_\gamma \log n$.

Implementation and Results

Run RJMCMC (Green, 1995) for 100K iterations in the full model space.
Eliminate non-important variables (those with marginal probabilities < 0.30), forming a new reduced model space.
Run RJMCMC for 100K iterations in the reduced model space to estimate posterior model odds and the best models.

Two setups:
1. Benefit-only analysis (uniform prior on model space).
2. Cost-Benefit Analysis (cost-penalized prior on model space).
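The link between the cost-penalized prior and the cost-adjusted BIC described above can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, the maximized log-likelihood value is made up, and the intercept is left out of the dimension count. The three costs are taken from the RAND cost table (Age 0.50, Blood Urea Nitrogen 1.50, Septic Complications 3.00), and n = 2,532 is the sample size quoted in the motivation.

```python
import numpy as np

def log_prior(gamma, costs, n, c0=None):
    """Unnormalized log f(gamma) = -sum_j (gamma_j / 2) * (c_j - c0) / c0 * log n."""
    gamma = np.asarray(gamma, dtype=float)
    costs = np.asarray(costs, dtype=float)
    c0 = costs.min() if c0 is None else c0  # default baseline: cheapest variable
    return float(-0.5 * np.sum(gamma * (costs - c0) / c0) * np.log(n))

def bic(loglik_hat, gamma, n):
    """Ordinary BIC: -2 log-likelihood + d_gamma * log n."""
    return -2.0 * loglik_hat + float(np.sum(gamma)) * np.log(n)

def cost_adjusted_bic(loglik_hat, gamma, costs, n, c0=None):
    """Cost-adjusted BIC: -2 log-likelihood + (C_gamma / c0) * log n."""
    gamma = np.asarray(gamma, dtype=float)
    costs = np.asarray(costs, dtype=float)
    c0 = costs.min() if c0 is None else c0
    model_cost = float(np.sum(gamma * costs))  # C_gamma
    return -2.0 * loglik_hat + (model_cost / c0) * np.log(n)

costs = [0.50, 1.50, 3.00]   # Age, Blood Urea Nitrogen, Septic Complications
gamma = [1, 0, 1]            # model containing Age and Septic Complications
n = 2532
loglik_hat = -800.0          # hypothetical maximized log-likelihood

lhs = cost_adjusted_bic(loglik_hat, gamma, costs, n)
rhs = bic(loglik_hat, gamma, n) - 2.0 * log_prior(gamma, costs, n)
print(lhs, rhs)
```

The two printed values agree because the prior contributes $(C_\gamma / c_0 - d_\gamma) \log n$ to the penalty, which turns the $d_\gamma \log n$ term of BIC into $(C_\gamma / c_0) \log n$.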
Preliminary Results: Marginal Probabilities $f(\gamma_j = 1 \mid y)$

[Table: variable index, name, cost, and marginal posterior probability under the benefit-only and cost-benefit analyses. Variables listed: Systolic Blood Pressure (SBP) Score; Age; Blood Urea Nitrogen; Apache II Coma Score; Shortness of Breath Day 1; Septic Complications; Initial Temperature; Heart Rate Day; Chest Pain Day; Cardiomegaly Score; Hematologic History Score; Apache Respiratory Rate Score; Admission SBP; Respiratory Rate Day; Confusion Day; Apache pH Score; Morbid + Comorbid Score; Musculoskeletal Score.]

Reduced Model Space: Posterior Model Probabilities/Odds

Common variables in both analyses: $X_1 + X_2 + X_3 + X_5 + X_{12} + X_{70}$.

[Table, Benefit-Only Analysis: for each model k, the additional variables, model cost, posterior probability and $PO_{1k}$. Common variables within the analysis: $X_4 + X_{15} + X_{37} + X_{73}$.]

[Table, Cost-Benefit Analysis: for each model k, the additional variables, model cost, posterior probability and $PO_{1k}$. Common variables within the analysis: $X_{46} + X_{51}$.]

Models with posterior probabilities above 3% are shown. $PO_{1k}$: posterior odds of the best model within each analysis versus the current model k.

Reduced Model Space: Comparisons

Comparison of measures of fit, cost and dimensionality between the best models in the reduced model space of the benefit-only and cost-benefit analyses; the percentage difference is relative to the benefit-only analysis.

[Table: columns Benefit-Only, Cost-Benefit, Difference (%); rows Minimum Deviance, Median Deviance, Cost, Dimension.]

4 Cost-Restriction-Benefit Analysis

Implement a Cost-Restriction-Benefit Analysis, in which the practical relevance of the selected variable subsets is ensured by enforcing an overall limit on the total data collection cost of each subset: the search is conducted only among models whose cost does not exceed this budgetary restriction $C$.

Therefore, we should a priori exclude models $\gamma$ with total cost larger than $C$, resulting in a significantly reduced model space,
\[
\mathcal{M} = \left\{ \gamma \in \{0, 1\}^p : \sum_{i=1}^{p} c_i \gamma_i \le C \right\}.
\]

AIM: Estimate posterior model probabilities in the cost-restricted model space.
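For a small number of candidate variables, the cost-restricted model space defined above can simply be enumerated. The sketch below does this by brute force; it is an illustration only, with a hypothetical function name and a hypothetical budget of 4 minutes. The six costs are taken from the RAND cost table (SBP score, Age, Blood Urea Nitrogen, APACHE II Coma Score, Shortness of Breath, Septic Complications). With p = 83 indicators enumeration is not feasible, which is why a sampling method is needed.

```python
from itertools import product

def feasible_models(costs, budget):
    """Return all gamma in {0,1}^p whose total cost does not exceed the budget."""
    models = []
    for gamma in product((0, 1), repeat=len(costs)):
        total = sum(g * c for g, c in zip(gamma, costs))
        if total <= budget:
            models.append(gamma)
    return models

costs = [0.5, 0.5, 1.5, 2.5, 1.0, 3.0]  # minutes of abstraction time
budget = 4.0                            # hypothetical cost limit C

models = feasible_models(costs, budget)
print(len(models), "of", 2 ** len(costs), "models satisfy the budget")
```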
PROBLEM: Due to the cost limit, areas of local maxima exist in the model space. Thus, we need to change the definition of the neighborhood structure of the proposed models and construct more advanced proposed jumps, possibly between models of the same cost, in order to avoid getting trapped in local maxima.

SOLUTION: Intelligent trans-dimensional MCMC methods that can move across areas of local maxima even if these are distinct.

Proposed Algorithm

We have developed a Population-Based Trans-Dimensional Reversible-Jump Markov Chain Monte Carlo algorithm (Population RJMCMC), combining ideas from the Population-Based MCMC (Jasra, Stephens and Holmes, 2007) and Simulated Tempering (Geyer and Thompson, 1995) algorithms.

Population RJMCMC

Use 3 chains: the actual one, plus two auxiliary ones.

In the auxiliary chains the posterior distributions are raised to a power $t_k$ (temperature), $k = 1, 2$.
1st auxiliary chain: $t_1 > 1$, increasing the differences between the posterior probabilities (this makes the distribution steeper, allowing the MCMC to move closer to locally best models).
2nd auxiliary chain: $0 < t_2 < 1$, reducing the differences between the posterior probabilities (this makes the distribution flatter, allowing the MCMC to move easily across different models).

Temperatures $t_k$ change stochastically. In this way the need for a large number of chains is avoided.

The incorporation of stochastic temperatures can be done using pseudo-priors $g_k(t_k)$. In this case the posterior distribution is expanded to
\[
f(\beta, \gamma, \beta^{(k)}, \gamma^{(k)}, t_1, t_2 \mid y) \propto \left\{ f(y \mid \beta, \gamma)\, f(\beta \mid \gamma)\, f(\gamma) \right\} \prod_{k=1}^{2} \left\{ f(y \mid \beta^{(k)}, \gamma^{(k)})\, f(\beta^{(k)} \mid \gamma^{(k)})\, f(\gamma^{(k)}) \right\}^{t_k} g_k(t_k),
\]
where $\gamma^{(k)}$ and $\beta^{(k)}$ are the model indicator and parameter vector of chain $k$.

Model indicators and parameters can be updated using RJMCMC steps, while the temperature $t_k$ can be generated from the conditional posterior distribution
\[
f(t_k \mid \beta, \gamma, \beta^{(k)}, \gamma^{(k)}, t_{\setminus k}, y) \propto \left\{ f(y \mid \beta^{(k)}, \gamma^{(k)})\, f(\beta^{(k)} \mid \gamma^{(k)})\, f(\gamma^{(k)}) \right\}^{t_k} g_k(t_k).
\]

The desired posterior marginal distribution for the temperatures $t_k$ is given by
\[
f(t_k \mid y) \propto \sum_{\gamma^{(k)} \in \mathcal{M}} \int \left( f(y \mid \beta^{(k)}, \gamma^{(k)})\, f(\beta^{(k)} \mid \gamma^{(k)})\, f(\gamma^{(k)}) \right)^{t_k} g_k(t_k)\, d\beta^{(k)} \propto Z_k(y, t_k)\, g_k(t_k),
\]
where $Z_k(y, t_k)$ is the marginal likelihood over all possible models for chain $k$.

Since the $g_k(t_k)$ are pseudo-priors, we can set
\[
g_k(t_k) \propto \frac{h_k(t_k)}{Z_k(y, t_k)},
\]
where the $h_k(t_k)$ are convenient density functions that are easy to simulate from, resulting in $f(t_k \mid y) = h_k(t_k)$.

For the selection of $h_k(t_k)$ we propose to use
\[
h_1(t_1) = \text{Gamma}(t_1 - 1;\, a_2, b_2) \quad \text{and} \quad h_2(t_2) = \text{Beta}(t_2;\, a_1, b_1).
\]

Prior Distributions

Same prior on model parameters as in the Cost-Benefit Analysis, and a uniform prior on the cost-restricted model space, i.e.
\[
f(\gamma) \propto I\Big(\gamma \in \mathcal{M} : c(\gamma) = \sum_{j} \gamma_j c_j \le C\Big),
\]
where $c_j$ is the differential cost per observation for variable $X_j$ and $C$ is the budgetary restriction.
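The effect of the two temperatures can be seen on a toy discrete distribution. The sketch below is not the Population RJMCMC algorithm itself: it only raises a hypothetical set of three posterior model probabilities to a power and renormalizes, and it draws temperatures directly from the proposed marginals $h_1$ and $h_2$. The function names, the hyperparameter values and the reading of $b_2$ as a rate parameter are assumptions.

```python
import numpy as np

def temper(probs, t):
    """Raise a discrete distribution to the power t and renormalize."""
    p = np.asarray(probs, dtype=float) ** t
    return p / p.sum()

def draw_temperatures(rng, a1, b1, a2, b2):
    """Draw t1 with t1 - 1 ~ Gamma(a2, rate b2) and t2 ~ Beta(a1, b1)."""
    t1 = 1.0 + rng.gamma(shape=a2, scale=1.0 / b2)  # b2 read as a rate (assumption)
    t2 = rng.beta(a1, b1)
    return t1, t2

probs = [0.5, 0.3, 0.2]                 # hypothetical posterior model probabilities
steeper = temper(probs, 2.0)            # t > 1: mass concentrates on the best model
flatter = temper(probs, 0.5)            # 0 < t < 1: differences shrink

rng = np.random.default_rng(1)
t1, t2 = draw_temperatures(rng, a1=2.0, b1=2.0, a2=2.0, b2=2.0)
print(steeper, flatter, t1, t2)
```

In the algorithm described above the temperatures are updated from their conditional posterior within the sampler; drawing from $h_k$ here only illustrates the ranges $t_1 > 1$ and $0 < t_2 < 1$.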
Implementation and Results

COST LIMIT: C = 10 minutes of abstraction time.

Run Population RJMCMC for 100K iterations in the full model space, twice, starting each time from a different model.
Eliminate non-important variables (those with marginal probabilities < 0.30 in both runs), forming a new reduced model space.
Run Population RJMCMC in the reduced space, twice.
Compare the results and performance of Population RJMCMC with simple RJMCMC.

Preliminary Results: Marginal Probabilities $f(\gamma_j = 1 \mid y)$

Variables with marginal posterior probabilities $f(\gamma_j = 1 \mid y)$ above 0.30 in at least one run.

[Table: variable index, name, cost, and marginal posterior probability in the first and second runs. Variables listed: Systolic Blood Pressure (SBP) Score; Age; Blood Urea Nitrogen; Apache II Coma Score; Shortness of Breath Day; Serum Albumin; Initial Temperature; Apache Respiratory Rate Score; Admission SBP; Respiratory Rate Day; Confusion Day; Body System Count; Apache pH Score.]

Reduced Model Space: Posterior Model Probabilities/Odds

Common variables in both analyses: $X_2 + X_4$.

[Table, Population RJMCMC, 500K iterations: for models $m_1$ to $m_4$, the additional variables, and the posterior probability and $PO_{1k}$ in the 1st and 2nd runs. Common variables: $X_1 + X_{12} + X_{37}$.]

[Table, Simple RJMCMC, 500K iterations: for models $m_1$, $m_3$, $m_2$, $m_5$ and $m_6$, the additional variables, and the posterior probability and $PO_{1k}$ in the 1st and 2nd runs. Common variable: $X_{62}$.]

$PO_{1k}$: posterior odds of the best model within each analysis versus the current model k. All models appearing in the table have total cost 10 min (the cost limit).

Reduced Model Space: Monte Carlo Errors

[Table: Monte Carlo errors (%) for models $m_1$ to $m_4$ by RJMCMC type (POP, SIMPLE), run and number of iterations, followed by relative comparisons of SIMPLE at 500K iterations versus POP for the first and second runs.]
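The quantities reported in these tables can be estimated from the sequence of models visited by a sampler. The sketch below shows one way to do so; it is an assumption-laden illustration rather than the procedure used for the slides. Posterior model probabilities are taken as visit frequencies, $PO_{1k}$ as the ratio of visit counts of the most visited model to model k, and the Monte Carlo error is computed by batch means, which is an assumption since the slides do not say how the errors were obtained. The function names and the toy chain are hypothetical.

```python
from collections import Counter
import numpy as np

def posterior_model_summary(visits):
    """Return (model, estimated posterior probability, PO_1k), best model first."""
    counts = Counter(visits)
    total = len(visits)
    ranked = counts.most_common()
    best_count = ranked[0][1]
    return [(m, c / total, best_count / c) for m, c in ranked]

def batch_means_error(visits, model, n_batches=10):
    """Batch-means standard error of the visit frequency of one model."""
    indicator = np.array([v == model for v in visits], dtype=float)
    batch_means = np.array([b.mean() for b in np.array_split(indicator, n_batches)])
    return float(batch_means.std(ddof=1) / np.sqrt(n_batches))

# Hypothetical chain of visited models (one label per iteration).
visits = ["m1"] * 6 + ["m2"] * 3 + ["m3"] * 1
summary = posterior_model_summary(visits)
for model, prob, odds in summary:
    print(model, prob, odds)
print(batch_means_error(visits, "m1", n_batches=5))
```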
5 Discussion

Cost-Benefit Analysis: The resulting models achieve dramatic gains in cost and a noticeable improvement in model simplicity, at the price of a small loss in predictive accuracy, when compared to the results of a more traditional benefit-only analysis.

Cost-Restriction-Benefit Analysis: The Population RJMCMC algorithm explores the model space efficiently and converges faster than simple RJMCMC (having lower Monte Carlo errors).

References

Draper D, Fouskakis D (2000). A case study of stochastic optimization in health policy: problem formulation and preliminary results. Journal of Global Optimization, 18.
Fouskakis D, Draper D (2002). Stochastic optimization: a review. International Statistical Review, 70.
Fouskakis D, Draper D (2008). Comparing stochastic optimization methods for variable selection in binary outcome prediction, with application to health policy. Journal of the American Statistical Association, 103, forthcoming.
Fouskakis D, Ntzoufras I, Draper D (2007a). Bayesian variable selection using cost-adjusted BIC, with application to cost-effective measurement of quality of health care. (submitted).
Fouskakis D, Ntzoufras I, Draper D (2007b). Population Based Reversible Jump MCMC for Bayesian Variable Selection and Evaluation Under Cost Limit Restrictions. (submitted).
Geyer CJ, Thompson EA (1995). Annealing Markov Chain Monte Carlo with applications to ancestral inference. Journal of the American Statistical Association, 90.
Green P (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82.
Jasra A, Stephens DA, Holmes CC (2007). Population-based reversible jump MCMC. Biometrika, forthcoming.
Keeler E, Kahn K, Draper D, Sherwood M, Rubenstein L, Reinisch E, Kosecoff J, Brook R (1990). Changes in sickness at admission following the introduction of the Prospective Payment System. Journal of the American Medical Association, 264.
Ntzoufras I, Dellaportas P, Forster JJ (2003). Bayesian variable and link determination for generalized linear models. Journal of Statistical Planning and Inference, 111.
More informationImputing Values to Missing Data
Imputing Values to Missing Data In federated data, between 30%70% of the data points will have at least one missing attribute  data wastage if we ignore all records with a missing value Remaining data
More informationGraduate Programs in Statistics
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANACHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANACHAMPAIGN Linear Algebra Slide 1 of
More informationCS 688 Pattern Recognition Lecture 4. Linear Models for Classification
CS 688 Pattern Recognition Lecture 4 Linear Models for Classification Probabilistic generative models Probabilistic discriminative models 1 Generative Approach ( x ) p C k p( C k ) Ck p ( ) ( x Ck ) p(
More informationBayes and Naïve Bayes. cs534machine Learning
Bayes and aïve Bayes cs534machine Learning Bayes Classifier Generative model learns Prediction is made by and where This is often referred to as the Bayes Classifier, because of the use of the Bayes rule
More informationYiming Peng, Department of Statistics. February 12, 2013
Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop
More informationLikelihood: Frequentist vs Bayesian Reasoning
"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B University of California, Berkeley Spring 2009 N Hallinan Likelihood: Frequentist vs Bayesian Reasoning Stochastic odels and
More informationThe Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities
The Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities Elizabeth GarrettMayer, PhD Assistant Professor Sidney Kimmel Comprehensive Cancer Center Johns Hopkins University 1
More informationCCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York
BME I5100: Biomedical Signal Processing Linear Discrimination Lucas C. Parra Biomedical Engineering Department CCNY 1 Schedule Week 1: Introduction Linear, stationary, normal  the stuff biology is not
More informationAn Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment
An Application of Inverse Reinforcement Learning to Medical Records of Diabetes Treatment Hideki Asoh 1, Masanori Shiro 1 Shotaro Akaho 1, Toshihiro Kamishima 1, Koiti Hasida 1, Eiji Aramaki 2, and Takahide
More informationLeast Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN13: 9780470860809 ISBN10: 0470860804 Editors Brian S Everitt & David
More informationDealing with large datasets
Dealing with large datasets (by throwing away most of the data) Alan Heavens Institute for Astronomy, University of Edinburgh with Ben Panter, Rob Tweedie, Mark Bastin, Will Hossack, Keith McKellar, Trevor
More informationExamining credit card consumption pattern
Examining credit card consumption pattern Yuhao Fan (Economics Department, Washington University in St. Louis) Abstract: In this paper, I analyze the consumer s credit card consumption data from a commercial
More informationSTAT3016 Introduction to Bayesian Data Analysis
STAT3016 Introduction to Bayesian Data Analysis Course Description The Bayesian approach to statistics assigns probability distributions to both the data and unknown parameters in the problem. This way,
More informationINDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition)
INDIRECT INFERENCE (prepared for: The New Palgrave Dictionary of Economics, Second Edition) Abstract Indirect inference is a simulationbased method for estimating the parameters of economic models. Its
More informationTOWARD BIG DATA ANALYSIS WORKSHOP
TOWARD BIG DATA ANALYSIS WORKSHOP 邁 向 巨 量 資 料 分 析 研 討 會 摘 要 集 2015.06.0506 巨 量 資 料 之 矩 陣 視 覺 化 陳 君 厚 中 央 研 究 院 統 計 科 學 研 究 所 摘 要 視 覺 化 (Visualization) 與 探 索 式 資 料 分 析 (Exploratory Data Analysis, EDA)
More informationCHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES. From Exploratory Factor Analysis Ledyard R Tucker and Robert C.
CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES From Exploratory Factor Analysis Ledyard R Tucker and Robert C MacCallum 1997 180 CHAPTER 8 FACTOR EXTRACTION BY MATRIX FACTORING TECHNIQUES In
More informationRegression Modeling Strategies
Frank E. Harrell, Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis With 141 Figures Springer Contents Preface Typographical Conventions
More informationWebbased Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni
1 Webbased Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed
More informationi=1 In practice, the natural logarithm of the likelihood function, called the loglikelihood function and denoted by
Statistics 580 Maximum Likelihood Estimation Introduction Let y (y 1, y 2,..., y n be a vector of iid, random variables from one of a family of distributions on R n and indexed by a pdimensional parameter
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationBayesian Statistics in One Hour. Patrick Lam
Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical
More informationMaking Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University atran3@stanford.edu ginzberg@stanford.edu I. Introduction III. Model The goal of our research
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study loglinear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationPricing and calibration in local volatility models via fast quantization
Pricing and calibration in local volatility models via fast quantization Parma, 29 th January 2015. Joint work with Giorgia Callegaro and Martino Grasselli Quantization: a brief history Birth: back to
More informationSampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data
Sampling via Moment Sharing: A New Framework for Distributed Bayesian Inference for Big Data (Oxford) in collaboration with: Minjie Xu, Jun Zhu, Bo Zhang (Tsinghua) Balaji Lakshminarayanan (Gatsby) Bayesian
More information11. Analysis of Casecontrol Studies Logistic Regression
Research methods II 113 11. Analysis of Casecontrol Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationProgram description for the Master s Degree Program in Mathematics and Finance
Program description for the Master s Degree Program in Mathematics and Finance : English: Master s Degree in Mathematics and Finance Norwegian, bokmål: Master i matematikk og finans Norwegian, nynorsk:
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002Topics in StatisticsBiological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationLatent Class Regression Part II
This work is licensed under a Creative Commons AttributionNonCommercialShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationStatistics in Applications III. Distribution Theory and Inference
2.2 Master of Science Degrees The Department of Statistics at FSU offers three different options for an MS degree. 1. The applied statistics degree is for a student preparing for a career as an applied
More informationLecture 6: Logistic Regression
Lecture 6: CS 19410, Fall 2011 Laurent El Ghaoui EECS Department UC Berkeley September 13, 2011 Outline Outline Classification task Data : X = [x 1,..., x m]: a n m matrix of data points in R n. y { 1,
More informationHedge fund pricing and model uncertainty
Hedge fund pricing and model uncertainty Spyridon D.Vrontos a, Ioannis D.Vrontos b, Daniel Giamouridis c, a Department of Statistics and ActuarialFinancial Mathematics, University of Aegean, Samos, Greece
More informationParameter Estimation: A Deterministic Approach using the LevenburgMarquardt Algorithm
Parameter Estimation: A Deterministic Approach using the LevenburgMarquardt Algorithm John Bardsley Department of Mathematical Sciences University of Montana Applied Math SeminarFeb. 2005 p.1/14 Outline