A hidden Markov model for criminal behaviour classification


RSS2004
Francesco Bartolucci, Institute of Economic Sciences, Urbino University, Italy.
Fulvia Pennoni, Department of Statistics, University of Florence, Italy.

Background
Analysis of criminal behaviour: we want to model offending patterns while taking into account the nature of offending and the sequence of offence types;
criminal histories are taken from official records: the England and Wales Offenders Index, a court-based record of the criminal histories of all offenders in England and Wales from 1963 to the current day;
general population sample of n = 5,470 individuals paroled from the cohort of those born in 1953, and followed through to 1993;
offences are combined into J = 10 major categories described in the Offenders Index Codebook (1998);
following Francis et al. (2004) we have defined T = 6 time windows or age strips: 10-15, 16-20, 21-25, 26-30,

Univariate Latent Markov model
Used by Bijleveld and Mooijaart (2003):
the offending pattern of a subject within age strip $t$, $t = 1, \ldots, T$, is represented by a single discrete random variable $X_t$;
$\{X_t\}$ depends only on a random process $\{C_t\}$;
$\{C_t\}$ follows a first-order homogeneous Markov chain with $k$ states, initial probabilities $\pi_c$ and transition probabilities $\pi_{c_1 c_2}$;
the joint distribution of $\{X_t\}$ may be expressed as
$$p(X_1 = x_1, \ldots, X_T = x_T) = \sum_{c_1} \phi_{x_1|c_1} \pi_{c_1} \sum_{c_2} \phi_{x_2|c_2} \pi_{c_1 c_2} \cdots \sum_{c_T} \phi_{x_T|c_T} \pi_{c_{T-1} c_T},$$
where $\phi_{x|c} = p(X_t = x \mid C_t = c)$.

Multivariate Extension
$X_{tj}$ is a binary random variable equal to 1 if the subject is convicted for an offence of type $j$ within age strip $t$ and to 0 otherwise;
we assume local independence, i.e. that for $t = 1, \ldots, T$ the variables $X_{tj}$ are conditionally independent given $C_t$:
$$\phi_{x|c} = p(X_t = x \mid C_t = c) = \prod_{j=1}^{J} \lambda_{j|c}^{x_j} (1 - \lambda_{j|c})^{1 - x_j},$$
where $\lambda_{j|c} = p(X_{tj} = 1 \mid C_t = c)$, $X_t = (X_{t1}, \ldots, X_{tJ})$ and $x_j$ denotes the $j$-th element of the vector $x$.
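The local independence formula can be illustrated with a minimal sketch. The function name `emission_prob` and the numerical values are assumptions made for illustration; they do not come from the slides.

```python
def emission_prob(x, lam_c):
    """Return p(X_t = x | C_t = c) under local independence.

    x     : sequence of 0/1 indicators, one per offence category j
    lam_c : sequence of conviction probabilities lambda_{j|c} for state c
    """
    if len(x) != len(lam_c):
        raise ValueError("x and lam_c must have the same length")
    prob = 1.0
    for x_j, lam in zip(x, lam_c):
        # Bernoulli term: lam if x_j == 1, (1 - lam) if x_j == 0
        prob *= lam if x_j == 1 else 1.0 - lam
    return prob


# Illustrative values only (J = 3 categories, not estimates from the slides)
lam_c = [0.2, 0.5, 0.1]
p = emission_prob([1, 0, 0], lam_c)  # product 0.2 * 0.5 * 0.9
```

Summing `emission_prob` over all $2^J$ binary vectors gives 1, which is a quick sanity check on any set of $\lambda_{j|c}$ values.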

Restricted version of the model (unidimensional Rasch)
We assume that for each type of offence we have
$$\text{logit}(\lambda_{j|c}) = \alpha_c + \beta_j, \qquad (1)$$
where $\alpha_c$ is the tendency to commit crimes of a subject in latent class $c$ (i.e. an individual characteristic) and $\beta_j$ is the easiness of committing a crime of type $j$;
the model allows an appropriate labelling of the latent classes, which can be ordered so that $\lambda_{j|1} \le \cdots \le \lambda_{j|k}$, $j = 1, \ldots, J$;
this constraint is used to formulate a latent class version of the Rasch (1961) model, which is well known in the psychometric literature.

Restricted version of the model (multidimensional Rasch)
The previous model assumes that every type of offence depends on the same latent trait: this may be too restrictive;
we consider that the crimes may be partitioned into $s$ homogeneous subgroups so that
$$\text{logit}(\lambda_{j|c}) = \sum_{d=1}^{s} \delta_{jd} \alpha_{cd} + \beta_j, \qquad (2)$$
where $\alpha_{cd}$ is the tendency of a subject in latent class $c$ to commit crimes in subgroup $d$, and $\delta_{jd}$ is equal to 1 if crime $j$ is in subgroup $d$ and to 0 otherwise;
we can classify the offences into groups such that crimes belonging to the same group depend on the same latent trait.
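A minimal sketch of how constraint (2) maps the parameters to conviction probabilities is given here. The function name `rasch_lambda`, the nested-list layout and the numerical values are assumptions for illustration only; the unidimensional model (1) is the special case with a single subgroup.

```python
import math


def rasch_lambda(alpha, beta, delta):
    """Return lam[c][j] = expit(sum_d delta[j][d] * alpha[c][d] + beta[j]).

    alpha : k x s nested list, alpha[c][d] = tendency of class c in subgroup d
    beta  : length-J list, beta[j] = easiness of offence type j
    delta : J x s nested list of 0/1, delta[j][d] = 1 if offence j is in subgroup d
    """
    lam = []
    for alpha_c in alpha:
        row = []
        for j, beta_j in enumerate(beta):
            # Linear predictor on the logit scale
            eta = sum(d_jd * a_cd for d_jd, a_cd in zip(delta[j], alpha_c)) + beta_j
            row.append(1.0 / (1.0 + math.exp(-eta)))  # inverse logit
        lam.append(row)
    return lam


# Illustrative values only: k = 2 classes, J = 3 offences, s = 2 subgroups
alpha = [[-2.0, -3.0], [0.5, -1.0]]
beta = [0.0, -1.0, 0.5]
delta = [[1, 0], [1, 0], [0, 1]]
lam = rasch_lambda(alpha, beta, delta)
```

With a single subgroup and increasing values of $\alpha_c$, every column of `lam` is increasing in $c$, which is the ordering of the latent classes that the unidimensional model permits.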

Likelihood inference
The log-likelihood of the model for an observed cohort of $n$ subjects is
$$l(\theta) = \sum_{i=1}^{n} \log[L_i(\theta)],$$
where $\theta$ denotes the vector of all the parameters and $L_i(\theta)$ is the probability $p(x_{i1}, \ldots, x_{iT})$ of the sequence observed for subject $i$, evaluated at $\theta$.
$L_i(\theta)$ may be computed through the well-known recursions in the hidden Markov literature (see Levinson et al., 1983, and MacDonald and Zucchini, 1997, Sec. 2.2);
$l(\theta)$ is maximized with the EM algorithm, which requires the log-likelihood of the complete data, $l^*(\theta)$.
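The recursion mentioned above can be sketched as a standard forward pass. This is an illustration under stated assumptions, not the authors' code: the function names, the nested-list layout of the parameters and the absence of rescaling (acceptable for short sequences such as T = 6, but prone to underflow for long ones) are all choices made here.

```python
def bernoulli_emission(x, lam_c):
    """p(X_t = x | C_t = c) under local independence."""
    prob = 1.0
    for x_j, lam in zip(x, lam_c):
        prob *= lam if x_j == 1 else 1.0 - lam
    return prob


def sequence_likelihood(xs, init, trans, lam):
    """Forward recursion for p(x_1, ..., x_T) in a latent Markov model.

    xs    : list of T binary vectors (one per age strip)
    init  : init[c] = initial probability of state c
    trans : trans[c1][c2] = probability of moving from state c1 to c2
    lam   : lam[c][j] = conviction probability for offence j in state c
    """
    k = len(init)
    # q[c] = p(x_1, ..., x_t, C_t = c), initialised at t = 1
    q = [init[c] * bernoulli_emission(xs[0], lam[c]) for c in range(k)]
    for x_t in xs[1:]:
        q = [
            bernoulli_emission(x_t, lam[c2])
            * sum(q[c1] * trans[c1][c2] for c1 in range(k))
            for c2 in range(k)
        ]
    return sum(q)


# Illustrative parameters only: k = 2 states, J = 1 offence type, T = 2
init = [0.6, 0.4]
trans = [[0.9, 0.1], [0.2, 0.8]]
lam = [[0.1], [0.7]]
L_i = sequence_likelihood([[1], [0]], init, trans, lam)
```

The log-likelihood $l(\theta)$ is then the sum of `math.log(sequence_likelihood(...))` over subjects. The cost is of order $T k^2$ per subject, instead of the $k^T$ terms of the explicit sum over state sequences.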

The complete data log-likelihood may be expressed as
$$l^*(\theta) = \sum_{c} v_{1c} \log \pi_c + \sum_{c_1} \sum_{c_2} u_{c_1 c_2} \log \pi_{c_1 c_2} + \sum_{i} \sum_{t} \sum_{c} \sum_{j} v_{itc} \{ x_{itj} \log \lambda_{j|c} + (1 - x_{itj}) \log(1 - \lambda_{j|c}) \},$$
where $v_{itc}$ is a dummy variable, referring to the $i$-th subject, which is equal to 1 if $C_t = c$ and to 0 otherwise, $v_{tc} = \sum_i v_{itc}$, and $u_{c_1 c_2}$ is the number of transitions from the $c_1$-th to the $c_2$-th state.

EM algorithm
E-step: computes the conditional expected value of $l^*(\theta)$, given the observed data and the current value of the parameters.
M-step: updates the parameter estimates by maximizing the expected value of $l^*(\theta)$ computed above.
When the model is constrained (unidimensional or multidimensional Rasch), the parameters $\alpha_{cd}$ and $\beta_j$ are estimated by fitting to the data a logistic model with a suitable design matrix $Z$, defined according to the model of interest.
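For the unconstrained model, the M-step has the usual closed form for hidden Markov models. The updates are written out here as a worked illustration; they are the standard result rather than something stated on the slides, and the hat notation for the E-step expected values ($\hat{v}_{itc}$ and $\hat{u}_{c_1 c_2}$ are the conditional expectations of $v_{itc}$ and $u_{c_1 c_2}$ given the observed data) is introduced here as an assumption.

```latex
\hat{\pi}_c = \frac{\hat{v}_{1c}}{n},
\qquad
\hat{\pi}_{c_1 c_2} = \frac{\hat{u}_{c_1 c_2}}{\sum_{c} \hat{u}_{c_1 c}},
\qquad
\hat{\lambda}_{j|c} = \frac{\sum_{i} \sum_{t} \hat{v}_{itc}\, x_{itj}}{\sum_{i} \sum_{t} \hat{v}_{itc}},
\qquad \text{with } \hat{v}_{1c} = \sum_{i} \hat{v}_{i1c}.
```

Each update is a weighted proportion. Under the Rasch constraints the last update has no closed form, which is why a logistic model is fitted instead.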

Choice of the number of classes (k)
The optimal number of latent classes can be chosen with the likelihood ratio between the model with $k$ states and that with $k+1$ states,
$$D_k = -2(\hat{l}_k - \hat{l}_{k+1}),$$
for increasing values of $k$;
or using the Bayesian Information Criterion (Kass and Raftery, 1995), defined as
$$BIC_k = -2\hat{l}_k + r_k \log(n),$$
where $r_k$ is the number of parameters in the model with $k$ states. According to this strategy, the optimal number of states is the one for which $BIC_k$ is minimum.
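The BIC-based choice of $k$ amounts to a few lines of code. In this sketch the function names and, above all, the log-likelihoods and parameter counts are made up for illustration; they are not the values obtained in the application.

```python
import math


def bic(loglik, n_params, n):
    """BIC_k = -2 * loglik + n_params * log(n)."""
    return -2.0 * loglik + n_params * math.log(n)


def select_k(fits, n):
    """fits maps k -> (maximised log-likelihood, number of parameters).

    Returns the k with the smallest BIC together with all BIC values.
    """
    bics = {k: bic(ll, r, n) for k, (ll, r) in fits.items()}
    best_k = min(bics, key=bics.get)
    return best_k, bics


# Made-up log-likelihoods and parameter counts, NOT the values from the slides
fits = {1: (-1000.0, 10), 2: (-900.0, 23), 3: (-895.0, 38)}
best_k, bics = select_k(fits, n=500)
```

In this toy input the gain in log-likelihood from $k = 2$ to $k = 3$ is too small to offset the penalty for the extra parameters, so `best_k` is 2.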

Choice of the number of latent traits
The crimes are clustered using a hierarchical algorithm.
At each step the algorithm aggregates the two clusters of crimes which are the closest in terms of the deviance between the model fitted at the previous step and the multidimensional Rasch model fitted after the aggregation of the two clusters.
The steps are iterated until the BIC of the resulting model is lower than that of the unconstrained model.
The algorithm stops when all the items are grouped together.
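The aggregation scheme can be sketched as a generic greedy loop. This is only an outline under assumptions: `merge_cost` is a hypothetical callback standing in for the deviance between the model before and after a merge (which in practice requires refitting the multidimensional Rasch model), and the toy cost used in the example has nothing to do with crime data. The sketch records every intermediate partition so that the one with the best BIC can be chosen afterwards.

```python
def greedy_agglomeration(items, merge_cost):
    """Greedily merge clusters, recording every intermediate partition.

    items      : list of item labels (here, offence categories)
    merge_cost : callable(partition, a, b) -> float, cost of merging the
                 clusters at indices a and b (hypothetical stand-in for
                 the deviance between the two fitted models)
    """
    partition = [frozenset([item]) for item in items]
    history = [list(partition)]
    while len(partition) > 1:
        # Enumerate all pairs of clusters and pick the cheapest merge
        pairs = [
            (a, b)
            for a in range(len(partition))
            for b in range(a + 1, len(partition))
        ]
        a, b = min(pairs, key=lambda p: merge_cost(partition, p[0], p[1]))
        merged = partition[a] | partition[b]
        partition = [c for i, c in enumerate(partition) if i not in (a, b)]
        partition.append(merged)
        history.append(list(partition))
    return history


def toy_cost(partition, a, b):
    """Toy cost for numeric items: distance between cluster means."""
    mean_a = sum(partition[a]) / len(partition[a])
    mean_b = sum(partition[b]) / len(partition[b])
    return abs(mean_a - mean_b)


history = greedy_agglomeration([1, 2, 10], toy_cost)
```

With J items the loop performs J - 1 merges and evaluates every pair of clusters at each step, so with J = 10 offence categories at most 45 candidate models would have to be fitted at any single step.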

An application
We applied the model to a sample of n = 5,470 males taken from the dataset illustrated above; we used the estimated number of live births in the cohort year 1953 as reported by Prime et al. (2001).
For a number of classes between 1 and 7 we obtain:
[Table with columns $k$, $\hat{l}_k$, $r_k$, $BIC_k$ for $k = 1, \ldots, 7$; the values are not recoverable from the transcription.]
We choose k = 5 states as this gives the smallest BIC.

Choice of the clusters
Using the hierarchical algorithm, the best fit (BIC = 35,433) was obtained for the following aggregation into clusters of the 10 types of crime, shown with the estimates of the $\beta_j$.
Columns: offence category ($j$), latent trait (marked X), $\beta_j$. The column positions of the X marks and all but the last numeric value are not recoverable from the transcription.
Violence against the person  X
Sexual offences  X
Burglary  X
Robbery  X
Theft and handling stolen goods  X
Fraud and Forgery  X
Criminal Damage  X
Drug Offences  X
Motoring Offences  X
Other offences  X  7.493

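Estimated α parameters
Values of the estimated tendencies of the subject for each latent state in every subgroup.
[Table with one row per latent state $c$ and columns $\alpha_1$, $\alpha_2$, ...; the values are not recoverable from the transcription.]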

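Estimate of π and Π
Initial probabilities $\pi_c$: [row of values, one per latent state; not recoverable from the transcription.]
Transition probabilities $\pi_{cd}$ of the Markov chain: [matrix with one row per state $c$; not recoverable from the transcription.]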

Advantages of the proposed methodology
We achieve a parsimonious description of the dynamic process underlying the data;
the approach is based on a general population sample and not on an offender-based sample as in other studies;
it allows us to estimate a wide choice of models and to choose the best one, going from the simple latent class model to the constrained model with subgroups;
it can provide important information for policy, such as incarceration or incapacitation policy against offenders.

Future extensions
Constrain the probabilities $\lambda_{j|c}$ to be equal to 0 for one latent class, so that this class may be identified as that of non-offending subjects;
consider also models in which the transition probabilities may vary with age (non-homogeneous Markov chains);
consider restricted models in which the transition matrix has a particular structure (e.g. triangular, symmetric);
include explanatory variables, such as gender or race, in the model.

References
Bijleveld, C. J. H. and Mooijaart, A. (2003). Latent Markov Modelling of Recidivism Data. Statistica Neerlandica, 57, 3,
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Statist. Soc., series B, 39,
Feng, Z. and McCulloch, C. E. (1996). Using Bootstrap Likelihood Ratios in Finite Mixture Models. J. R. Statist. Soc., series B, 58,
Francis, B., Soothill, K. and Fligelstone, R. (2004). Identifying Patterns and Pathways of Offending Behaviour: A New Approach to Typologies of Crime. European Journal of Criminology, 1,
Kass, R. E. and Raftery, A. (1995). Bayes factors. Journal of the American Statistical Association, 90 (430),
Lazarsfeld, P. F. and Henry, N. W. (1968). Latent Structure Analysis. Boston: Houghton Mifflin.
Levinson, S. E., Rabiner, L. R. and Sondhi, M. M. (1983). An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition. Bell System Technical Journal, 62,
Lindsay, B., Clogg, C. and Grego, J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 86,
McCutcheon, A. L. and Thomas, G. (1995). Patterns of drug use among white institutionalized delinquents in Georgia. Evidence from a latent class analysis. Journal of Drug Education, 25,
MacDonald, I. and Zucchini, W. (1997). Hidden Markov and Other Models for Discrete-valued Time Series. London: Chapman & Hall.
McLachlan, G. J. and Peel, D. (2000). Finite Mixture Models. New York: John Wiley.
Research development and Statistics Directorate (1998). Offenders Index Codebook. London: Home Office. Available at
Prime, J., White, S., Liriano, S. and Patel, K. (2001). Criminal careers of those born between 1953 and Statistical Bulletin 4/01. London: Home Office.
Rasch, G. (1961). On general laws and the meaning of measurement in psychology. Proceedings of the IV Berkeley Symposium on Mathematical Statistics and Probability, 4,
Wiggins, L. M. (1973). Panel Analysis: Latent Probability Models for Attitudes and Behavior Processes. Amsterdam: Elsevier.

Item selection by latent class-based methods: an application to nursing homes evaluation

Item selection by latent class-based methods: an application to nursing homes evaluation Item selection by latent class-based methods: an application to nursing homes evaluation Francesco Bartolucci, Giorgio E. Montanari, Silvia Pandolfi 1 Department of Economics, Finance and Statistics University

More information

Introduction to latent variable models

Introduction to latent variable models Introduction to latent variable models lecture 1 Francesco Bartolucci Department of Economics, Finance and Statistics University of Perugia, IT bart@stat.unipg.it Outline [2/24] Latent variables and their

More information

Using Mixtures-of-Distributions models to inform farm size selection decisions in representative farm modelling. Philip Kostov and Seamus McErlean

Using Mixtures-of-Distributions models to inform farm size selection decisions in representative farm modelling. Philip Kostov and Seamus McErlean Using Mixtures-of-Distributions models to inform farm size selection decisions in representative farm modelling. by Philip Kostov and Seamus McErlean Working Paper, Agricultural and Food Economics, Queen

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

Robotics 2 Clustering & EM. Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Maren Bennewitz, Wolfram Burgard

Robotics 2 Clustering & EM. Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Maren Bennewitz, Wolfram Burgard Robotics 2 Clustering & EM Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Maren Bennewitz, Wolfram Burgard 1 Clustering (1) Common technique for statistical data analysis to detect structure (machine learning,

More information

Clustering - example. Given some data x i X Find a partitioning of the data into k disjunctive clusters Example: k-means clustering

Clustering - example. Given some data x i X Find a partitioning of the data into k disjunctive clusters Example: k-means clustering Clustering - example Graph Mining and Graph Kernels Given some data x i X Find a partitioning of the data into k disjunctive clusters Example: k-means clustering x!!!!8!!! 8 x 1 1 Clustering - example

More information

The Start of a Criminal Career: Does the Type of Debut Offence Predict Future Offending? Research Report 77. Natalie Owen & Christine Cooper

The Start of a Criminal Career: Does the Type of Debut Offence Predict Future Offending? Research Report 77. Natalie Owen & Christine Cooper The Start of a Criminal Career: Does the Type of Debut Offence Predict Future Offending? Research Report 77 Natalie Owen & Christine Cooper November 2013 Contents Executive Summary... 3 Introduction...

More information

Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

More information

Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu)

Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu) Paper Author (s) Chenfeng Xiong (corresponding), University of Maryland, College Park (cxiong@umd.edu) Lei Zhang, University of Maryland, College Park (lei@umd.edu) Paper Title & Number Dynamic Travel

More information

Lecture 10: Sequential Data Models

Lecture 10: Sequential Data Models CSC2515 Fall 2007 Introduction to Machine Learning Lecture 10: Sequential Data Models 1 Example: sequential data Until now, considered data to be i.i.d. Turn attention to sequential data Time-series: stock

More information

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

More information

Crime Location Crime Type Month Year Betting Shop Criminal Damage April 2010 Betting Shop Theft April 2010 Betting Shop Assault April 2010

Crime Location Crime Type Month Year Betting Shop Criminal Damage April 2010 Betting Shop Theft April 2010 Betting Shop Assault April 2010 Crime Location Crime Type Month Year Betting Shop Theft April 2010 Betting Shop Assault April 2010 Betting Shop Theft April 2010 Betting Shop Theft April 2010 Betting Shop Assault April 2010 Betting Shop

More information

Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation

Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2015 CS 551, Fall 2015

More information

A general statistical framework for assessing Granger causality

A general statistical framework for assessing Granger causality A general statistical framework for assessing Granger causality The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published

More information

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

More information

Item Response Theory in R using Package ltm

Item Response Theory in R using Package ltm Item Response Theory in R using Package ltm Dimitris Rizopoulos Department of Biostatistics, Erasmus University Medical Center, the Netherlands d.rizopoulos@erasmusmc.nl Department of Statistics and Mathematics

More information

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

More information

Reject Inference in Credit Scoring. Jie-Men Mok

Reject Inference in Credit Scoring. Jie-Men Mok Reject Inference in Credit Scoring Jie-Men Mok BMI paper January 2009 ii Preface In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business

More information

Model-Based Cluster Analysis for Web Users Sessions

Model-Based Cluster Analysis for Web Users Sessions Model-Based Cluster Analysis for Web Users Sessions George Pallis, Lefteris Angelis, and Athena Vakali Department of Informatics, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece gpallis@ccf.auth.gr

More information

Cell Phone based Activity Detection using Markov Logic Network

Cell Phone based Activity Detection using Markov Logic Network Cell Phone based Activity Detection using Markov Logic Network Somdeb Sarkhel sxs104721@utdallas.edu 1 Introduction Mobile devices are becoming increasingly sophisticated and the latest generation of smart

More information

Package MixGHD. June 26, 2015

Package MixGHD. June 26, 2015 Type Package Package MixGHD June 26, 2015 Title Model Based Clustering, Classification and Discriminant Analysis Using the Mixture of Generalized Hyperbolic Distributions Version 1.7 Date 2015-6-15 Author

More information

Statistics in Retail Finance. Chapter 6: Behavioural models

Statistics in Retail Finance. Chapter 6: Behavioural models Statistics in Retail Finance 1 Overview > So far we have focussed mainly on application scorecards. In this chapter we shall look at behavioural models. We shall cover the following topics:- Behavioural

More information

Conditional Random Fields: An Introduction

Conditional Random Fields: An Introduction Conditional Random Fields: An Introduction Hanna M. Wallach February 24, 2004 1 Labeling Sequential Data The task of assigning label sequences to a set of observation sequences arises in many fields, including

More information

Course: Model, Learning, and Inference: Lecture 5

Course: Model, Learning, and Inference: Lecture 5 Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.

More information

Linear Classification. Volker Tresp Summer 2015

Linear Classification. Volker Tresp Summer 2015 Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong

More information

Note on the EM Algorithm in Linear Regression Model

Note on the EM Algorithm in Linear Regression Model International Mathematical Forum 4 2009 no. 38 1883-1889 Note on the M Algorithm in Linear Regression Model Ji-Xia Wang and Yu Miao College of Mathematics and Information Science Henan Normal University

More information

Classifying Galaxies using a data-driven approach

Classifying Galaxies using a data-driven approach Classifying Galaxies using a data-driven approach Supervisor : Prof. David van Dyk Department of Mathematics Imperial College London London, April 2015 Outline The Classification Problem 1 The Classification

More information

Curriculum Vitae of Francesco Bartolucci

Curriculum Vitae of Francesco Bartolucci Curriculum Vitae of Francesco Bartolucci Department of Economics, Finance and Statistics University of Perugia Via A. Pascoli, 20 06123 Perugia (IT) email: bart@stat.unipg.it http://www.stat.unipg.it/bartolucci

More information

The Probit Link Function in Generalized Linear Models for Data Mining Applications

The Probit Link Function in Generalized Linear Models for Data Mining Applications Journal of Modern Applied Statistical Methods Copyright 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 164-169 1538 9472/13/$95.00 The Probit Link Function in Generalized Linear Models for Data Mining Applications

More information

ASC 076 INTRODUCTION TO SOCIAL AND CRIMINAL PSYCHOLOGY

ASC 076 INTRODUCTION TO SOCIAL AND CRIMINAL PSYCHOLOGY DIPLOMA IN CRIME MANAGEMENT AND PREVENTION COURSES DESCRIPTION ASC 075 INTRODUCTION TO SOCIOLOGY AND ANTHROPOLOGY Defining Sociology and Anthropology, Emergence of Sociology, subject matter and subdisciplines.

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

Structural Equation Models: Mixture Models

Structural Equation Models: Mixture Models Structural Equation Models: Mixture Models Jeroen K. Vermunt Department of Methodology and Statistics Tilburg University Jay Magidson Statistical Innovations Inc. 1 Introduction This article discusses

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Likelihood Approaches for Trial Designs in Early Phase Oncology

Likelihood Approaches for Trial Designs in Early Phase Oncology Likelihood Approaches for Trial Designs in Early Phase Oncology Clinical Trials Elizabeth Garrett-Mayer, PhD Cody Chiuzan, PhD Hollings Cancer Center Department of Public Health Sciences Medical University

More information

Statistical Analysis with Missing Data

Statistical Analysis with Missing Data Statistical Analysis with Missing Data Second Edition RODERICK J. A. LITTLE DONALD B. RUBIN WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents Preface PARTI OVERVIEW AND BASIC APPROACHES

More information

Support Vector Machines with Clustering for Training with Very Large Datasets

Support Vector Machines with Clustering for Training with Very Large Datasets Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano

More information

Introduction to mixed model and missing data issues in longitudinal studies

Introduction to mixed model and missing data issues in longitudinal studies Introduction to mixed model and missing data issues in longitudinal studies Hélène Jacqmin-Gadda INSERM, U897, Bordeaux, France Inserm workshop, St Raphael Outline of the talk I Introduction Mixed models

More information

MS1b Statistical Data Mining

MS1b Statistical Data Mining MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to

More information

Statistical Machine Learning from Data

Statistical Machine Learning from Data Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Gaussian Mixture Models Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Christfried Webers. Canberra February June 2015

Christfried Webers. Canberra February June 2015 c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic

More information

The Exponential Family

The Exponential Family The Exponential Family David M. Blei Columbia University November 3, 2015 Definition A probability density in the exponential family has this form where p.x j / D h.x/ expf > t.x/ a./g; (1) is the natural

More information

Automated Hierarchical Mixtures of Probabilistic Principal Component Analyzers

Automated Hierarchical Mixtures of Probabilistic Principal Component Analyzers Automated Hierarchical Mixtures of Probabilistic Principal Component Analyzers Ting Su tsu@ece.neu.edu Jennifer G. Dy jdy@ece.neu.edu Department of Electrical and Computer Engineering, Northeastern University,

More information

Data a systematic approach

Data a systematic approach Pattern Discovery on Australian Medical Claims Data a systematic approach Ah Chung Tsoi Senior Member, IEEE, Shu Zhang, Markus Hagenbuchner Member, IEEE Abstract The national health insurance system in

More information

DATA ANALYTICS USING R

DATA ANALYTICS USING R DATA ANALYTICS USING R Duration: 90 Hours Intended audience and scope: The course is targeted at fresh engineers, practicing engineers and scientists who are interested in learning and understanding data

More information

Message-passing sequential detection of multiple change points in networks

Message-passing sequential detection of multiple change points in networks Message-passing sequential detection of multiple change points in networks Long Nguyen, Arash Amini Ram Rajagopal University of Michigan Stanford University ISIT, Boston, July 2012 Nguyen/Amini/Rajagopal

More information

Model-Based Recursive Partitioning for Detecting Interaction Effects in Subgroups

Model-Based Recursive Partitioning for Detecting Interaction Effects in Subgroups Model-Based Recursive Partitioning for Detecting Interaction Effects in Subgroups Achim Zeileis, Torsten Hothorn, Kurt Hornik http://eeecon.uibk.ac.at/~zeileis/ Overview Motivation: Trees, leaves, and

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

More information

Detection of changes in variance using binary segmentation and optimal partitioning

Detection of changes in variance using binary segmentation and optimal partitioning Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the

More information

Probabilistic trust models in network security

Probabilistic trust models in network security UNIVERSITY OF SOUTHAMPTON Probabilistic trust models in network security by Ehab M. ElSalamouny A thesis submitted in partial fulfillment for the degree of Doctor of Philosophy in the Faculty of Engineering

More information

An introduction to Hidden Markov Models

An introduction to Hidden Markov Models An introduction to Hidden Markov Models Christian Kohlschein Abstract Hidden Markov Models (HMM) are commonly defined as stochastic finite state machines. Formally a HMM can be described as a 5-tuple Ω

More information

MACHINE LEARNING IN HIGH ENERGY PHYSICS

MACHINE LEARNING IN HIGH ENERGY PHYSICS MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

More information

Hypothesis Testing. 1 Introduction. 2 Hypotheses. 2.1 Null and Alternative Hypotheses. 2.2 Simple vs. Composite. 2.3 One-Sided and Two-Sided Tests

Hypothesis Testing. 1 Introduction. 2 Hypotheses. 2.1 Null and Alternative Hypotheses. 2.2 Simple vs. Composite. 2.3 One-Sided and Two-Sided Tests Hypothesis Testing 1 Introduction This document is a simple tutorial on hypothesis testing. It presents the basic concepts and definitions as well as some frequently asked questions associated with hypothesis

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for

More information

Fitting Subject-specific Curves to Grouped Longitudinal Data
