A hidden Markov model for criminal behaviour classification


RSS2004 p.1/19
A hidden Markov model for criminal behaviour classification
Francesco Bartolucci, Institute of Economic Sciences, Urbino University, Italy.
Fulvia Pennoni, Department of Statistics, University of Florence, Italy.

Background
Analysis of criminal behaviour: we want to model offending patterns while taking into account the nature of offending and the sequence of offence types;
criminal histories are recorded as official histories: the England and Wales Offenders Index is a court-based record of the criminal histories of all offenders in England and Wales from 1963 to the current day;
general population sample of n = 5,470 individuals paroled from the cohort of those born in 1953, and followed through to 1993;
offences are combined into J = 10 major categories described in the Offenders Index Codebook (1998);
following Francis et al. (2004) we have defined T = 6 time windows or age strips: 10-15, 16-20, 21-25, 26-30, ...

Univariate Latent Markov model
Used by Bijleveld and Mooijaart (2003):
the offending pattern of a subject within age strip $t$, $t = 1, \ldots, T$, is represented by a single discrete random variable $X_t$;
$\{X_t\}$ depends only on a random process $\{C_t\}$;
$\{C_t\}$ follows a first-order homogeneous Markov chain with $k$ states, initial probabilities $\pi_c$ and transition probabilities $\pi_{c_1 c_2}$;
the joint distribution of $\{X_t\}$ may be expressed as
\[ p(X_1 = x_1, \ldots, X_T = x_T) = \sum_{c_1} \phi_{x_1|c_1} \pi_{c_1} \sum_{c_2} \phi_{x_2|c_2} \pi_{c_1 c_2} \cdots \sum_{c_T} \phi_{x_T|c_T} \pi_{c_{T-1} c_T}, \]
where $\phi_{x|c} = p(X_t = x \mid C_t = c)$.
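The joint distribution above can be checked numerically on a toy case. The following is a minimal sketch, not the authors' code: it evaluates the sum over every latent path directly, which is only feasible for small $k$ and $T$. All parameter values and the function name `joint_prob_bruteforce` are hypothetical.

```python
import itertools
import numpy as np

def joint_prob_bruteforce(x, pi0, P, phi):
    """p(X_1=x_1,...,X_T=x_T), summing over every latent path (c_1,...,c_T).

    pi0[c]    : initial probability of state c
    P[c1, c2] : transition probability from state c1 to state c2
    phi[x, c] : p(X_t = x | C_t = c)
    """
    k = len(pi0)
    total = 0.0
    for path in itertools.product(range(k), repeat=len(x)):
        # Probability of this latent path together with the observed sequence.
        p = pi0[path[0]] * phi[x[0], path[0]]
        for t in range(1, len(x)):
            p *= P[path[t - 1], path[t]] * phi[x[t], path[t]]
        total += p
    return total

# Hypothetical parameters: k = 2 latent states, 3 response categories.
pi0 = np.array([0.7, 0.3])
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
phi = np.array([[0.80, 0.10],
                [0.15, 0.30],
                [0.05, 0.60]])  # each column sums to 1

# Sanity check: the probabilities of all 3**4 sequences of length T = 4 add up to 1.
total_mass = sum(joint_prob_bruteforce(seq, pi0, P, phi)
                 for seq in itertools.product(range(3), repeat=4))
```

The cost grows as $k^T$, which is why recursions are preferred in practice; the brute-force version is useful mainly as a reference when testing a faster implementation.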

Multivariate Extension
$X_{tj}$ is a binary random variable equal to 1 if the subject is convicted for an offence of type $j$ within age strip $t$ and to 0 otherwise;
we assume local independence, i.e. that for $t = 1, \ldots, T$ the $X_{tj}$ are conditionally independent given $C_t$:
\[ \phi_{x|c} = p(X_t = x \mid C_t = c) = \prod_{j=1}^{J} \lambda_{j|c}^{x_j} (1 - \lambda_{j|c})^{1 - x_j}, \]
where $\lambda_{j|c} = p(X_{tj} = 1 \mid C_t = c)$, $X_t = (X_{t1}, \ldots, X_{tJ})$ and $x_j$ denotes the $j$th element of the vector $x$.
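As an illustration of the local independence formula, the sketch below computes $\phi_{x|c}$ for every latent class from a matrix of conditional probabilities. The array layout (`lam[j, c]`), the function name and the numbers are assumptions made for the example.

```python
import numpy as np

def emission_prob(x, lam):
    """phi_{x|c} for every class c under local independence.

    x   : binary vector of length J
    lam : array of shape (J, k) with lam[j, c] = p(X_tj = 1 | C_t = c)
    """
    x = np.asarray(x)[:, None]  # shape (J, 1), broadcast over classes
    return np.prod(lam ** x * (1.0 - lam) ** (1 - x), axis=0)

# Hypothetical values: J = 3 offence types, k = 2 latent classes.
lam = np.array([[0.10, 0.60],
                [0.05, 0.40],
                [0.20, 0.70]])

# Convicted for offence types 1 and 3, but not 2.
phi = emission_prob([1, 0, 1], lam)
```

For the first class this is 0.10 × 0.95 × 0.20 = 0.019, and for the second 0.60 × 0.60 × 0.70 = 0.252.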

Restricted version of the model (unidimensional Rasch)
We assume that for each type of offence we have
\[ \mathrm{logit}(\lambda_{j|c}) = \alpha_c + \beta_j, \qquad (1) \]
where $\alpha_c$ is the tendency to commit crimes of a subject in latent class $c$ (i.e. an individual characteristic) and $\beta_j$ is the easiness of committing a crime of type $j$;
with an appropriate labelling of the latent classes, it allows the classes to be ordered so that $\lambda_{j|1} \le \cdots \le \lambda_{j|k}$, $j = 1, \ldots, J$;
this constraint is used to formulate a latent class version of the Rasch (1961) model, which is well known in the psychometric literature.

Restricted version of the model (multidimensional Rasch)
The previous model assumes that each type of offence has the same latent trait: this may be too restrictive;
we consider that the crimes may be partitioned into $s$ homogeneous subgroups so that
\[ \mathrm{logit}(\lambda_{j|c}) = \sum_{d=1}^{s} \delta_{jd} \alpha_{cd} + \beta_j, \qquad (2) \]
where $\alpha_{cd}$ is the tendency of a subject in latent class $c$ to commit crimes in subgroup $d$, and $\delta_{jd}$ is equal to 1 if crime $j$ is in subgroup $d$ and to 0 otherwise;
we can classify the offences into groups where crimes belonging to the same group have the same latent trait.
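The two Rasch parameterizations can be sketched in a few lines: model (1) is the special case of model (2) with a single subgroup. This is an illustrative sketch only; the function name `rasch_lambda`, the array shapes and the parameter values are assumptions, not taken from the slides.

```python
import numpy as np

def rasch_lambda(alpha, beta, delta=None):
    """lam[j, c] from logit(lam_{j|c}) = sum_d delta[j, d] * alpha[c, d] + beta[j].

    alpha : (k, s) tendencies, or (k,) in the unidimensional case
    beta  : (J,) easiness parameters
    delta : (J, s) 0/1 subgroup membership; None means one common subgroup
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    if delta is None:
        # Unidimensional model (1): every crime type loads on the same trait.
        alpha = alpha.reshape(-1, 1)
        delta = np.ones((beta.size, 1))
    eta = np.asarray(delta, dtype=float) @ alpha.T + beta[:, None]  # (J, k)
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical example: J = 3 crime types, s = 2 subgroups, k = 2 classes.
delta = np.array([[1, 0],
                  [1, 0],
                  [0, 1]])
alpha = np.array([[-2.0, -3.0],   # class 1: low tendency in both subgroups
                  [1.0, 0.0]])    # class 2: higher tendency
beta = np.array([0.0, -1.0, 0.5])

lam_multi = rasch_lambda(alpha, beta, delta)
lam_uni = rasch_lambda([-1.0, 1.0], beta)
```

In the unidimensional case, labelling the classes so that the $\alpha_c$ increase makes every row of `lam_uni` increase across classes, which is the ordering property mentioned for model (1).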

Likelihood inference
The loglikelihood of the model for an observed cohort of $n$ subjects is
\[ l(\theta) = \sum_{i=1}^{n} \log[L_i(\theta)], \]
where $\theta$ denotes all the parameters and $L_i(\theta)$ is the probability $p(x_{i1}, \ldots, x_{iT})$ defined above, evaluated at $\theta$.
$L_i(\theta)$ may be computed through the well-known recursions in the hidden Markov literature (see Levinson et al., 1983, and MacDonald and Zucchini, 1997, Sec. 2.2);
$l(\theta)$ is maximized with the EM algorithm, which requires the loglikelihood of the complete data, $l^*(\theta)$.
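One common form of the recursions mentioned above is the scaled forward recursion. The sketch below is a generic version written for the multivariate binary model, not the authors' implementation; the function names, array shapes and parameter values are assumptions, and every entry of `lam` is assumed to lie strictly between 0 and 1 so that the logarithms are finite.

```python
import numpy as np

def subject_loglik(X, pi0, P, lam):
    """log L_i(theta) for one subject via a scaled forward recursion.

    X   : (T, J) binary array, X[t, j] = 1 if convicted for type j in strip t
    pi0 : (k,) initial probabilities
    P   : (k, k) transition probabilities, P[c1, c2] = p(C_t = c2 | C_{t-1} = c1)
    lam : (J, k) conditional probabilities, strictly inside (0, 1)
    """
    X = np.asarray(X)
    # phi[t, c] = prod_j lam[j, c]^x_tj * (1 - lam[j, c])^(1 - x_tj)
    phi = np.exp(X @ np.log(lam) + (1 - X) @ np.log1p(-lam))
    f = pi0 * phi[0]
    loglik = 0.0
    for t in range(1, X.shape[0]):
        s = f.sum()               # rescale to avoid numerical underflow
        loglik += np.log(s)
        f = ((f / s) @ P) * phi[t]
    return loglik + np.log(f.sum())

def cohort_loglik(data, pi0, P, lam):
    """l(theta): sum of log L_i(theta) over subjects."""
    return sum(subject_loglik(X, pi0, P, lam) for X in data)

# Hypothetical toy setting: k = 2 states, J = 1 offence type, T = 2 strips.
pi0 = np.array([0.6, 0.4])
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])
lam = np.array([[0.2, 0.7]])
data = [np.array([[1], [0]]), np.array([[0], [0]])]
total = cohort_loglik(data, pi0, P, lam)
```

The recursion costs on the order of $T k^2$ operations per subject, compared with $k^T$ for direct summation over latent paths.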

The complete data loglikelihood may be expressed as
\[ l^*(\theta) = \sum_{c} v_{1c} \log \pi_c + \sum_{c_1} \sum_{c_2} u_{c_1 c_2} \log \pi_{c_1 c_2} + \sum_{i} \sum_{t} \sum_{c} \sum_{j} v_{itc} \{ x_{itj} \log \lambda_{j|c} + (1 - x_{itj}) \log(1 - \lambda_{j|c}) \}, \]
where $v_{itc}$ is a dummy variable, referred to the $i$th subject, which is equal to 1 if $C_t = c$ and to 0 otherwise, $v_{tc} = \sum_i v_{itc}$, and $u_{c_1 c_2}$ is the number of transitions from the $c_1$th to the $c_2$th state.
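For the unconstrained model, maximizing an expression of this form under the sum-to-one constraints on the initial and transition probabilities leads to simple closed-form solutions. The following is a sketch of that standard result rather than a formula from the slides; the hats on $v$ and $u$ are an assumed notation for the quantities (or, within EM, their conditional expectations) plugged into the expression.

```latex
\[
\hat{\pi}_c = \frac{\hat{v}_{1c}}{n}, \qquad
\hat{\pi}_{c_1 c_2} = \frac{\hat{u}_{c_1 c_2}}{\sum_{c} \hat{u}_{c_1 c}}, \qquad
\hat{\lambda}_{j|c} = \frac{\sum_{i} \sum_{t} \hat{v}_{itc}\, x_{itj}}{\sum_{i} \sum_{t} \hat{v}_{itc}} .
\]
```

Each estimate is a relative frequency: the share of subjects starting in state $c$, the share of transitions out of $c_1$ that go to $c_2$, and the share of subject-periods in state $c$ with a conviction of type $j$.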

EM algorithm
E-step: computes the conditional expected value of $l^*(\theta)$, given the observed data and the current value of the parameters.
M-step: updates the parameter estimates by maximizing the expected value of $l^*(\theta)$ computed in the E-step.
When the model is constrained (unidimensional or multidimensional Rasch), the parameters $\alpha_{cd}$ and $\beta_j$ are estimated by fitting to the data a logistic model with a suitable design matrix $Z$, defined according to the model of interest.
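The two steps can be sketched for the unconstrained multivariate model as follows. This is a generic forward-backward EM written for illustration, not the authors' implementation: the function names, array shapes, simulated data and starting values are all assumptions, and the clipping of `lam` away from 0 and 1 is a numerical safeguard added here. The constrained (Rasch) M-step, which needs a logistic fit, is not covered.

```python
import numpy as np

def e_step(X, pi0, P, lam):
    """Scaled forward-backward pass over all subjects; X has shape (n, T, J)."""
    phi = np.exp(X @ np.log(lam) + (1 - X) @ np.log1p(-lam))  # (n, T, k)
    n, T, k = phi.shape
    f = np.empty((n, T, k))
    b = np.ones((n, T, k))
    s = np.empty((n, T))
    f[:, 0] = pi0 * phi[:, 0]
    s[:, 0] = f[:, 0].sum(axis=1)
    f[:, 0] /= s[:, [0]]
    for t in range(1, T):                      # forward recursion
        f[:, t] = (f[:, t - 1] @ P) * phi[:, t]
        s[:, t] = f[:, t].sum(axis=1)
        f[:, t] /= s[:, [t]]
    for t in range(T - 2, -1, -1):             # backward recursion
        b[:, t] = ((phi[:, t + 1] * b[:, t + 1]) @ P.T) / s[:, [t + 1]]
    v = f * b                                  # v[i, t, c] = p(C_t = c | x_i)
    u = np.zeros((k, k))                       # expected transition counts
    for t in range(1, T):
        w = phi[:, t] * b[:, t] / s[:, [t]]
        u += P * (f[:, t - 1].T @ w)
    return v, u, np.log(s).sum()               # last value is l(theta)

def m_step(X, v, u):
    """Closed-form updates for the unconstrained model."""
    pi0 = v[:, 0].mean(axis=0)
    P = u / u.sum(axis=1, keepdims=True)
    lam = np.einsum('itc,itj->jc', v, X) / v.sum(axis=(0, 1))
    return pi0, P, np.clip(lam, 1e-6, 1 - 1e-6)  # clipping is a safeguard

# Hypothetical data: n = 200 subjects, T = 6 age strips, J = 10 offence types.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 6, 10)).astype(float)
pi0 = np.full(3, 1 / 3)
P = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
lam = rng.uniform(0.2, 0.8, size=(10, 3))

logliks = []
for _ in range(20):
    v, u, ll = e_step(X, pi0, P, lam)
    logliks.append(ll)
    pi0, P, lam = m_step(X, v, u)
```

The recorded loglikelihood values should never decrease from one iteration to the next, which gives a convenient check on an implementation. In practice several random starting values are usually tried, since EM may converge to a local maximum.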

Choice of the number of classes (k)
The optimal number of latent classes can be chosen with the likelihood ratio statistic between the model with $k$ states and that with $k + 1$ states,
\[ D_k = -2(\hat{l}_k - \hat{l}_{k+1}), \]
for increasing values of $k$;
or using the Bayesian Information Criterion (Kass and Raftery, 1995), defined as
\[ BIC_k = -2 \hat{l}_k + r_k \log(n), \]
where $r_k$ is the number of parameters in the model with $k$ states.
According to this strategy, the optimal number of states is the one for which $BIC_k$ is smallest.
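Both criteria are straightforward to compute once the maximized loglikelihoods are available. In the sketch below the loglikelihood values, the sample size and the helper names are hypothetical, and the parameter count is an assumption that applies to the unconstrained model only: $k - 1$ free initial probabilities, $k(k - 1)$ free transition probabilities and $kJ$ conditional probabilities.

```python
import math

def n_params_unconstrained(k, J):
    """Free parameters: (k - 1) initial + k(k - 1) transition + k*J conditional."""
    return (k - 1) + k * (k - 1) + k * J

def bic(loglik, n_params, n):
    """BIC_k = -2 * l_k + r_k * log(n); smaller is better."""
    return -2.0 * loglik + n_params * math.log(n)

def lr_stat(loglik_k, loglik_k1):
    """D_k = -2 * (l_k - l_{k+1}), comparing k states against k + 1 states."""
    return -2.0 * (loglik_k - loglik_k1)

# Hypothetical maximized loglikelihoods for k = 1, ..., 4 (not the slide values).
logliks = {1: -2100.0, 2: -1900.0, 3: -1850.0, 4: -1845.0}
n, J = 500, 10

bics = {k: bic(l, n_params_unconstrained(k, J), n) for k, l in logliks.items()}
best_k = min(bics, key=bics.get)
```

With these made-up numbers the loglikelihood keeps improving as $k$ grows, but the penalty term outweighs the gain beyond three states, so `best_k` is 3.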

Choice of the number of latent traits
The crimes are clustered using a hierarchical algorithm.
At each step the algorithm aggregates the two clusters of crimes that are closest in terms of the deviance between the model fitted at the previous step and the multidimensional Rasch model fitted after aggregating the two clusters.
The steps are iterated until the BIC of the resulting model is lower than that of the unconstrained model.
The algorithm stops when all the items are grouped together.
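The agglomeration loop itself can be sketched independently of the model fitting. In the sketch below, `merge_cost` is a hypothetical callback standing in for the deviance comparison, which in the actual procedure would require refitting the multidimensional Rasch model for each candidate merge; the toy cost and the scores are invented for illustration. The sketch simply records every partition along the way, so that a criterion such as BIC can be compared across them afterwards.

```python
from itertools import combinations

def greedy_merge(items, merge_cost):
    """Agglomerate clusters step by step.

    merge_cost(partition, a, b) should return the increase in deviance caused
    by merging clusters a and b (hypothetical callback).
    Returns the list of partitions, from all singletons to one single cluster.
    """
    partition = [frozenset([i]) for i in items]
    path = [list(partition)]
    while len(partition) > 1:
        # Pick the pair of clusters whose merge is cheapest.
        a, b = min(combinations(partition, 2),
                   key=lambda pair: merge_cost(partition, *pair))
        partition = [c for c in partition if c not in (a, b)] + [a | b]
        path.append(list(partition))
    return path

# Toy stand-in for the deviance: distance between mean scores of two clusters.
score = {"burglary": 0.1, "theft": 0.2, "violence": 1.5, "damage": 1.45}

def toy_cost(partition, a, b):
    mean_a = sum(score[i] for i in a) / len(a)
    mean_b = sum(score[i] for i in b) / len(b)
    return abs(mean_a - mean_b)

path = greedy_merge(score, toy_cost)
```

With $J$ items the loop produces $J$ partitions, and step $m$ evaluates every pair among the $J - m + 1$ remaining clusters, so the number of model fits grows roughly with $J^3$ when each evaluation needs a refit.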

An application
We applied the model to a sample of n = 5,470 males taken from the dataset illustrated above;
we used the estimated number of live births in the cohort year 1953 as reported by Prime et al. (2001).
For a number of classes between 1 and 7 we obtain:
[Table with columns $k$, $\hat{l}_k$, $r_k$, $BIC_k$ for $k = 1, \ldots, 7$; the numeric entries are not recoverable from the transcription.]
We choose k = 5 states, which gives the smallest BIC.

Choice of the clusters
Using the hierarchical algorithm, the best fit (BIC = 35,433) was obtained for the following cluster aggregation of the 10 crime typologies, shown with the estimates of the $\beta_j$.
Offence category (j)                  latent trait   β_j
Violence against the person           X
Sexual offences                       X
Burglary                              X
Robbery                               X
Theft and handling stolen goods       X
Fraud and Forgery                     X
Criminal Damage                       X
Drug Offences                         X
Motoring Offences                     X
Other offences                        X              7.493

Estimated α parameters
Values of the estimated tendencies of the subjects for each latent state $c$ in every subgroup.
[Table with columns $c$, $\alpha_1$, $\alpha_2$, ...; the numeric entries are not recoverable from the transcription.]
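
Estimates of π and Π
Initial probabilities $\pi_c$:
[Table with columns $\pi_1$, $\pi_2$, $\pi_3$, $\pi_4$, ...; the numeric entries are not recoverable from the transcription.]
Transition probabilities $\pi_{cd}$ of the Markov chain:
[Table indexed by $c$; the numeric entries are not recoverable from the transcription.]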

Advantages of the proposed methodology
We achieve a parsimonious description of the dynamic process underlying the data;
the approach is based on a general population sample and not on an offender-based sample as in other studies;
it allows us to estimate a wide choice of models and to choose the best one, ranging from the simple latent class model to the constrained model with subgroups;
it can provide important information for policy, such as incarceration or incapacitation policy against offenders.

Future extensions
Constrain the probabilities $\lambda_{j|c}$ to be equal to 0 for one latent class, so that this class may be identified as that of non-offending subjects;
consider also models in which the transition probabilities may vary with age (non-homogeneous Markov chains);
consider restricted models in which the transition matrix has a particular structure (e.g. triangular, symmetric);
include explanatory variables, such as gender or race, in the model.

References
Bijleveld, C. J. H. and Mooijaart, A. (2003). Latent Markov Modelling of Recidivism Data. Statistica Neerlandica, 57, 3.
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Statist. Soc., series B, 39.
Feng, Z. and McCulloch, C. E. (1996). Using Bootstrap Likelihood Ratios in Finite Mixture Models. J. R. Statist. Soc., series B, 58.
Francis, B., Soothill, K. and Fligelstone, R. (2004). Identifying Patterns and Pathways of Offending Behaviour: A New Approach to Typologies of Crime. European Journal of Criminology, 1.
Kass, R. E. and Raftery, A. (1995). Bayes factors. Journal of the American Statistical Association, 90 (430).
Lazarsfeld, P. F. and Henry, N. W. (1968). Latent Structure Analysis. Boston: Houghton Mifflin.
Levinson, S. E., Rabiner, L. R. and Sondhi, M. M. (1983). An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition. Bell System Technical Journal, 62.
Lindsay, B., Clogg, C. and Grego, J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 86.
MacDonald, I. and Zucchini, W. (1997). Hidden Markov and Other Models for Discrete-valued Time Series. London: Chapman & Hall.
McCutcheon, A. L. and Thomas, G. (1995). Patterns of drug use among white institutionalized delinquents in Georgia. Evidence from a latent class analysis. Journal of Drug Education, 25.
McLachlan, G. J. and Peel, D. (2000). Finite Mixture Models. New York: Wiley.
Prime, J., White, S., Liriano, S. and Patel, K. (2001). Criminal careers of those born between 1953 and [...]. Statistical Bulletin 4/01. London: Home Office.
Rasch, G. (1961). On general laws and the meaning of measurement in psychology. Proceedings of the IV Berkeley Symposium on Mathematical Statistics and Probability, 4.
Research Development and Statistics Directorate (1998). Offenders Index Codebook. London: Home Office.
Wiggins, L. M. (1973). Panel Analysis: Latent Probability Models for Attitudes and Behavior Processes. Amsterdam: Elsevier.
A Bootstrap MetropolisHastings Algorithm for Bayesian Analysis of Big Data Faming Liang University of Florida August 9, 2015 Abstract MCMC methods have proven to be a very powerful tool for analyzing
More informationTutorial on variational approximation methods. Tommi S. Jaakkola MIT AI Lab
Tutorial on variational approximation methods Tommi S. Jaakkola MIT AI Lab tommi@ai.mit.edu Tutorial topics A bit of history Examples of variational methods A brief intro to graphical models Variational
More informationStandard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In nonlinear regression models, such as the heteroskedastic
More informationCS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
More informationPackage EstCRM. July 13, 2015
Version 1.4 Date 2015711 Package EstCRM July 13, 2015 Title Calibrating Parameters for the Samejima's Continuous IRT Model Author Cengiz Zopluoglu Maintainer Cengiz Zopluoglu
More informationA mixture model for random graphs
A mixture model for random graphs JJ Daudin, F. Picard, S. Robin robin@inapg.inra.fr UMR INAPG / ENGREF / INRA, Paris Mathématique et Informatique Appliquées Examples of networks. Social: Biological:
More informationCHAPTER 2 Estimating Probabilities
CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a
More informationOverview Classes. 123 Logistic regression (5) 193 Building and applying logistic regression (6) 263 Generalizations of logistic regression (7)
Overview Classes 123 Logistic regression (5) 193 Building and applying logistic regression (6) 263 Generalizations of logistic regression (7) 24 Loglinear models (8) 54 1517 hrs; 5B02 Building and
More informationCentral Statistics Office (CSO) Recorded Crime Statistics Frequently Asked Questions
Central Statistics Office (CSO) Recorded Crime Statistics Frequently Asked Questions 26th June 2014 Introduction. The purposes of this document is to address some commonly asked questions about CSO recorded
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationMETHOD OF MOMENTS LEARNING FOR LEFTTORIGHT HIDDEN MARKOV MODELS
METHOD OF MOMENTS LEARNING FOR LEFTTORIGHT HIDDEN MARKOV MODELS Y. Cem Subakan [, Johannes Traa ], Paris Smaragdis [,],\, Daniel Hsu ]] [ UIUC Computer Science Department, ]] Columbia University Computer
More information6. If there is no improvement of the categories after several steps, then choose new seeds using another criterion (e.g. the objects near the edge of
Clustering Clustering is an unsupervised learning method: there is no target value (class label) to be predicted, the goal is finding common patterns or grouping similar examples. Differences between models/algorithms
More informationQuestionnaire: Domestic (Gender and Family) Violence Interventions
Questionnaire: Domestic (Gender and Family) Violence Interventions STRENGTHENING TRANSNATIONAL APPROACHES TO REDUCING REOFFENDING (STARR) On behalf of The Institute of Criminology STRENGTHENING TRANSNATIONAL
More informationAn Extension of the CHAID Treebased Segmentation Algorithm to Multiple Dependent Variables
An Extension of the CHAID Treebased Segmentation Algorithm to Multiple Dependent Variables Jay Magidson 1 and Jeroen K. Vermunt 2 1 Statistical Innovations Inc., 375 Concord Avenue, Belmont, MA 02478,
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation:  Feature vector X,  qualitative response Y, taking values in C
More informationMethods of Data Analysis Working with probability distributions
Methods of Data Analysis Working with probability distributions Week 4 1 Motivation One of the key problems in nonparametric data analysis is to create a good model of a generating probability distribution,
More information