A hidden Markov model for criminal behaviour classification


1 RSS2004 p.1/19

A hidden Markov model for criminal behaviour classification

Francesco Bartolucci, Institute of Economic Sciences, Urbino University, Italy.
Fulvia Pennoni, Department of Statistics, University of Florence, Italy.
2 Background

Analysis of criminal behaviour: we want to model offending patterns while taking into account the nature of offending and the sequence of offence types;

criminal histories recorded as official histories: the England and Wales Offenders Index, a court-based record of the criminal histories of all offenders in England and Wales from 1963 to the current day;

general population sample of n = 5,470 individuals sampled from the cohort of those born in 1953 and followed through to 1993;

offences are combined into J = 10 major categories described in the Offenders Index Codebook (1998);

following Francis et al. (2004) we define T = 6 time windows or age strips: 10-15, 16-20, 21-25, 26-30, ...
3 Univariate latent Markov model

Used by Bijleveld and Mooijaart (2003):

the offending pattern of a subject within age strip t, t = 1, ..., T, is represented by a single discrete random variable X_t;

{X_t} depends only on a latent random process {C_t};

{C_t} follows a first-order homogeneous Markov chain with k states, initial probabilities \pi_c and transition probabilities \pi_{c_1 c_2};

the joint distribution of {X_t} may be expressed as

p(X_1 = x_1, ..., X_T = x_T) = \sum_{c_1} \sum_{c_2} \cdots \sum_{c_T} \pi_{c_1} \phi_{x_1|c_1} \pi_{c_1 c_2} \phi_{x_2|c_2} \cdots \pi_{c_{T-1} c_T} \phi_{x_T|c_T},

where \phi_{x|c} = p(X_t = x | C_t = c).
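As a minimal sketch (not the authors' code), the joint distribution above can be evaluated with the standard forward recursion instead of the explicit sum over all state sequences; `pi`, `Pi` and `Phi` are hypothetical parameter arrays:

```python
import numpy as np

def joint_prob(x, pi, Pi, Phi):
    """p(X_1 = x_1, ..., X_T = x_T) for the univariate latent Markov model.

    pi  : (k,)  initial probabilities pi_c
    Pi  : (k,k) transition probabilities Pi[c1, c2] = pi_{c1 c2}
    Phi : (m,k) conditional probabilities Phi[x, c] = phi_{x|c}
    """
    # forward recursion: alpha[c] = p(x_1, ..., x_t, C_t = c)
    alpha = pi * Phi[x[0], :]
    for xt in x[1:]:
        alpha = (alpha @ Pi) * Phi[xt, :]
    return float(alpha.sum())
```

When the columns of Phi coincide the latent chain carries no information and the result reduces to the product of the marginal probabilities, which gives a quick sanity check.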
4 Multivariate extension

X_{tj} is a binary random variable equal to 1 if the subject is convicted of an offence of type j within age strip t, and to 0 otherwise;

we assume local independence, i.e. that for t = 1, ..., T the X_{tj} are conditionally independent given C_t:

\phi_{x|c} = p(X_t = x | C_t = c) = \prod_{j=1}^{J} \lambda_{j|c}^{x_j} (1 - \lambda_{j|c})^{1 - x_j},

where \lambda_{j|c} = p(X_{tj} = 1 | C_t = c), X_t = (X_{t1}, ..., X_{tJ})' and x_j denotes the j-th element of the vector x.
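Under local independence the conditional probability of a response pattern is just a product of Bernoulli terms; a small illustrative helper (names are this sketch's own):

```python
import numpy as np

def phi_multivariate(x, lam_c):
    """p(X_t = x | C_t = c) under local independence: a product of
    independent Bernoulli terms, one per offence type j.

    x     : (J,) binary vector of offence indicators
    lam_c : (J,) probabilities lambda_{j|c} for latent class c
    """
    x = np.asarray(x)
    lam_c = np.asarray(lam_c)
    return float(np.prod(lam_c**x * (1.0 - lam_c)**(1 - x)))
```

Summing this quantity over all 2^J response patterns returns 1, confirming it is a proper conditional distribution.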
5 Restricted version of the model (unidimensional Rasch)

We assume that for each type of offence

logit(\lambda_{j|c}) = \alpha_c + \beta_j,   (1)

where \alpha_c is the tendency to commit crimes of a subject in latent class c (i.e. an individual characteristic) and \beta_j is the easiness of committing a crime of type j;

this allows an appropriate labelling of the latent classes, ordering them so that

\lambda_{j|1} <= ... <= \lambda_{j|k},   j = 1, ..., J;

this constraint is used to formulate a latent class version of the Rasch (1961) model, which is well-known in the psychometric literature.
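Model (1) can be made concrete with a few lines (an illustrative sketch, with hypothetical parameter values):

```python
import numpy as np

def lambda_rasch(alpha, beta):
    """lam[c, j] = logistic(alpha_c + beta_j): the conviction probability
    for latent class c and offence type j under model (1)."""
    a = np.asarray(alpha, dtype=float)[:, None]   # class tendencies
    b = np.asarray(beta, dtype=float)[None, :]    # offence easiness
    return 1.0 / (1.0 + np.exp(-(a + b)))
```

Because the logistic function is increasing, sorting the alpha_c automatically yields the ordering lambda_{j|1} <= ... <= lambda_{j|k} for every j, which is what makes the latent classes labellable.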
6 Restricted version of the model (multidimensional Rasch)

The previous model assumes that each type of offence measures the same latent trait: this may be too restrictive;

we assume instead that the crimes may be partitioned into s homogeneous subgroups, so that

logit(\lambda_{j|c}) = \sum_{d=1}^{s} \delta_{jd} \alpha_{cd} + \beta_j,   (2)

where \alpha_{cd} is the tendency of a subject in latent class c to commit crimes in subgroup d, and \delta_{jd} is equal to 1 if crime j is in subgroup d and to 0 otherwise;

we can thus classify the offences into groups where crimes belonging to the same group measure the same latent trait.
7 Likelihood inference

The log-likelihood of the model for an observed cohort of n subjects is

l(\theta) = \sum_{i=1}^{n} log[L_i(\theta)],

where \theta denotes all the parameters and L_i(\theta) is the manifest probability p(x_{i1}, ..., x_{iT}) evaluated at \theta.

L_i(\theta) may be computed through the well-known recursions of the hidden Markov literature (see Levinson et al., 1983, and MacDonald and Zucchini, 1997, Sec. 2.2);

l(\theta) is maximized with the EM algorithm, which requires the complete-data log-likelihood l*(\theta).
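A sketch of the recursion in the multivariate case, with the usual scaling of the forward variables to avoid numerical underflow for long sequences (illustrative, not the authors' implementation):

```python
import numpy as np

def log_likelihood(X, pi, Pi, lam):
    """l(theta) = sum_i log L_i(theta) for the multivariate model.

    X   : (n, T, J) binary offence indicators
    pi  : (k,) initial probabilities
    Pi  : (k, k) transition matrix
    lam : (k, J) lam[c, j] = p(X_tj = 1 | C_t = c)
    """
    total = 0.0
    for xi in X:
        # phi[t, c] = p(x_t | C_t = c) under local independence
        phi = np.prod(lam**xi[:, None, :] * (1 - lam)**(1 - xi[:, None, :]), axis=2)
        alpha = pi * phi[0]                  # forward variables at t = 1
        ll = 0.0
        for t in range(1, xi.shape[0]):
            s = alpha.sum()
            ll += np.log(s)                  # accumulate the scaling factors
            alpha = ((alpha / s) @ Pi) * phi[t]
        total += ll + np.log(alpha.sum())
    return float(total)
```

In the degenerate one-state case the recursion reduces to the plain product of Bernoulli probabilities, which provides a simple check on the implementation.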
8 The complete-data log-likelihood may be expressed as

l*(\theta) = \sum_c v_{1c} log \pi_c + \sum_{c_1} \sum_{c_2} u_{c_1 c_2} log \pi_{c_1 c_2} + \sum_i \sum_t \sum_c \sum_j v_{itc} {x_{itj} log \lambda_{j|c} + (1 - x_{itj}) log(1 - \lambda_{j|c})},

where v_{itc} is a dummy variable, referred to the i-th subject, equal to 1 if C_t = c and to 0 otherwise, v_{tc} = \sum_i v_{itc}, and u_{c_1 c_2} is the number of transitions from the c_1-th to the c_2-th state.
9 EM algorithm

E-step: compute the conditional expected value of l*(\theta), given the observed data and the current value of the parameters.

M-step: update the parameter estimates by maximizing the expected value of l*(\theta) computed above.

When the model is constrained (unidimensional or multidimensional Rasch), the parameters \alpha_{cd} and \beta_j are estimated by fitting to the data a logistic model with a suitable design matrix Z, defined according to the model of interest.
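As a toy version of the two steps, the special case with T = 1 (an ordinary latent class model with unconstrained lambda's) can be fitted as below; with the Rasch constraints the M-step for lambda would instead fit a logistic model with the design matrix Z. All names and defaults are this sketch's assumptions:

```python
import numpy as np

def em_latent_class(X, k, n_iter=200, seed=0):
    """EM for a latent class model with J binary items (the T = 1 case)."""
    rng = np.random.default_rng(seed)
    n, J = X.shape
    pi = np.full(k, 1.0 / k)                      # class weights
    lam = rng.uniform(0.3, 0.7, size=(k, J))      # random start for lambda
    for _ in range(n_iter):
        # E-step: posterior class probabilities p(C = c | x_i)
        like = np.prod(lam[None]**X[:, None] * (1 - lam[None])**(1 - X[:, None]), axis=2)
        post = like * pi
        post /= post.sum(axis=1, keepdims=True)
        # M-step: weighted relative frequencies
        pi = post.mean(axis=0)
        lam = (post.T @ X) / post.sum(axis=0)[:, None]
    return pi, lam
```

On well-separated simulated classes the procedure recovers the class weights and the item probabilities up to label switching, which is why any check must compare sorted quantities.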
10 Choice of the number of classes (k)

The optimal number of latent classes can be chosen with the likelihood ratio statistic between the model with k states and that with k + 1 states,

D_k = -2(\hat{l}_k - \hat{l}_{k+1}),

for increasing values of k; or using the Bayesian Information Criterion (Kass and Raftery, 1995), defined as

BIC_k = -2 \hat{l}_k + r_k log(n),

where r_k is the number of parameters of the model with k states. According to this strategy, the optimal number of states is the one for which BIC_k is minimum.
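The BIC selection rule is a one-liner; a small sketch with hypothetical fitted log-likelihoods:

```python
import numpy as np

def bic(loglik, r, n):
    """BIC = -2 * loglik + r * log(n); smaller is better."""
    return -2.0 * loglik + r * np.log(n)

def choose_k(logliks, n_params, n):
    """Number of states (starting at k = 1) minimising BIC over the fits."""
    scores = [bic(l, r, n) for l, r in zip(logliks, n_params)]
    return int(np.argmin(scores)) + 1
```

Note how the r_k log(n) penalty can overturn a raw likelihood comparison: a model with a slightly higher log-likelihood but many more parameters loses.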
11 Choice of the number of latent traits

The crimes are clustered using a hierarchical algorithm. At each step the algorithm aggregates the two clusters of crimes which are the closest in terms of the deviance between the model fitted at the previous step and the multidimensional Rasch model fitted after the aggregation of the two clusters. The steps are iterated until all the items are grouped together, and the selected partition is the one whose BIC is lowest, provided it improves on that of the unconstrained model.
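The greedy aggregation can be sketched generically; here `deviance_increase` and `bic_of` are stand-ins for refitting the multidimensional Rasch model after a merge, and are assumptions of this sketch rather than the authors' code:

```python
def hierarchical_grouping(items, deviance_increase, bic_of):
    """Greedy agglomeration of crimes into latent-trait subgroups.

    At each step merge the two clusters with the smallest deviance
    increase; return the visited partition with the lowest BIC.
    """
    clusters = [frozenset([i]) for i in items]
    best_bic, best_part = bic_of(clusters), [set(c) for c in clusters]
    while len(clusters) > 1:
        # evaluate every candidate merge of two current clusters
        pairs = [(deviance_increase(a, b), a, b)
                 for i, a in enumerate(clusters) for b in clusters[i + 1:]]
        _, a, b = min(pairs, key=lambda p: p[0])
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        score = bic_of(clusters)
        if score < best_bic:
            best_bic, best_part = score, [set(c) for c in clusters]
    return best_bic, best_part
```

With a toy deviance (the spread of item values within a merged cluster) and a BIC that charges a fixed penalty per cluster, the sketch correctly stops at the natural two-group partition rather than merging everything.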
12 An application

We applied the model to a sample of n = 5,470 males taken from the dataset illustrated above; we used the estimated number of live births in the cohort year 1953 as reported by Prime et al. (2001).

For a number of classes between 1 and 7 we obtain a table of \hat{l}_k, r_k and BIC_k (table omitted).

We choose k = 5 states as this gives the smallest BIC.
13 Choice of the clusters

Using the hierarchical algorithm, the best fit (BIC = 35,433) was obtained with the following cluster aggregations for the 10 typologies of crime, together with the estimates of the \beta_j's:

(table assigning each offence category -- Violence against the person, Sexual offences, Burglary, Robbery, Theft and handling stolen goods, Fraud and forgery, Criminal damage, Drug offences, Motoring offences, Other offences -- to one of the s latent traits, with the estimated \beta_j for each; e.g. \hat{\beta} = 7.493 for Other offences)
14 Estimated \alpha parameters

Values of the estimated tendencies \alpha_{cd} for each latent state c in every subgroup d (table omitted).
15 Estimates of \pi and \Pi

Initial probabilities \pi_1, ..., \pi_5 and transition probabilities \pi_{cd} of the Markov chain (tables omitted).
16 Advantages of the proposed methodology

We achieve a parsimonious description of the dynamic process underlying the data;

the approach is based on a general population sample and not on an offender-based sample, as in other studies;

it allows a wide choice of models to be estimated and the best one to be selected, ranging from the simple latent class model to the constrained model with subgroups;

it can provide important information for policy, such as incarceration or incapacitation policies against offenders.
17 Future extensions

Constrain the probabilities \lambda_{j|c} to be equal to 0 for one latent class, so that this class may be identified as that of non-offending subjects;

consider models in which the transition probabilities may vary with age (non-homogeneous Markov chains);

consider restricted models in which the transition matrix has a particular structure (e.g. triangular, symmetric);

include explanatory variables, such as gender or race, in the model.
18 References

Bijleveld, C. J. H. and Mooijaart, A. (2003). Latent Markov modelling of recidivism data. Statistica Neerlandica, 57, 3.

Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39.

Feng, Z. and McCulloch, C. E. (1996). Using bootstrap likelihood ratios in finite mixture models. Journal of the Royal Statistical Society, Series B, 58.

Francis, B., Soothill, K. and Fligelstone, R. (2004). Identifying patterns and pathways of offending behaviour: a new approach to typologies of crime. European Journal of Criminology, 1.

Home Office (1998). Offenders Index Codebook. London: Home Office. Available from the Research Development and Statistics Directorate.

Kass, R. E. and Raftery, A. (1995). Bayes factors. Journal of the American Statistical Association, 90(430).

Lazarsfeld, P. F. and Henry, N. W. (1968). Latent Structure Analysis. Boston: Houghton Mifflin.

Levinson, S. E., Rabiner, L. R. and Sondhi, M. M. (1983). An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition. Bell System Technical Journal, 62.

Lindsay, B., Clogg, C. and Grego, J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 86.

MacDonald, I. and Zucchini, W. (1997). Hidden Markov and Other Models for Discrete-valued Time Series. London: Chapman & Hall.

McCutcheon, A. L. and Thomas, G. (1995). Patterns of drug use among white institutionalized delinquents in Georgia: evidence from a latent class analysis. Journal of Drug Education, 25.

McLachlan, G. J. and Peel, D. (2000). Finite Mixture Models. New York: John Wiley.

Prime, J., White, S., Liriano, S. and Patel, K. (2001). Criminal careers of those born between 1953 and ... Statistical Bulletin 4/01. London: Home Office.

Rasch, G. (1961). On general laws and the meaning of measurement in psychology. Proceedings of the IV Berkeley Symposium on Mathematical Statistics and Probability, 4.

Wiggins, L. M. (1973). Panel Analysis: Latent Probability Models for Attitudes and Behavior Processes. Amsterdam: Elsevier.
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
More informationUW CSE Technical Report 030601 Probabilistic Bilinear Models for AppearanceBased Vision
UW CSE Technical Report 030601 Probabilistic Bilinear Models for AppearanceBased Vision D.B. Grimes A.P. Shon R.P.N. Rao Dept. of Computer Science and Engineering University of Washington Seattle, WA
More informationStatistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Gaussian Mixture Models Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique
More informationDetection of changes in variance using binary segmentation and optimal partitioning
Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the
More information10601: Machine Learning Midterm Exam November 3, Solutions
10601: Machine Learning Midterm Exam November 3, 2010 Solutions Instructions: Make sure that your exam has 16 pages (not including this cover sheet) and is not missing any sheets, then write your full
More informationA HYBRID GENETIC ALGORITHM FOR THE MAXIMUM LIKELIHOOD ESTIMATION OF MODELS WITH MULTIPLE EQUILIBRIA: A FIRST REPORT
New Mathematics and Natural Computation Vol. 1, No. 2 (2005) 295 303 c World Scientific Publishing Company A HYBRID GENETIC ALGORITHM FOR THE MAXIMUM LIKELIHOOD ESTIMATION OF MODELS WITH MULTIPLE EQUILIBRIA:
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationComparing Conditional and Marginal Direct Estimation of Subgroup Distributions
RESEARCH REPORT January 2003 RR0302 Comparing Conditional and Marginal Direct Estimation of Subgroup Distributions Matthias von Davier Research & Development Division Princeton, NJ 08541 Comparing Conditional
More informationData a systematic approach
Pattern Discovery on Australian Medical Claims Data a systematic approach Ah Chung Tsoi Senior Member, IEEE, Shu Zhang, Markus Hagenbuchner Member, IEEE Abstract The national health insurance system in
More informationMACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
More informationThe Exponential Family
The Exponential Family David M. Blei Columbia University November 3, 2015 Definition A probability density in the exponential family has this form where p.x j / D h.x/ expf > t.x/ a./g; (1) is the natural
More informationClass Notes: Week 3. proficient
Ronald Heck Class Notes: Week 3 1 Class Notes: Week 3 This week we will look a bit more into relationships between two variables using crosstabulation tables. Let s go back to the analysis of home language
More informationA crash course in probability and Naïve Bayes classification
Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s
More informationHidden Markov Models Fundamentals
Hidden Markov Models Fundamentals Daniel Ramage CS229 Section Notes December, 2007 Abstract How can we apply machine learning to data that is represented as a sequence of observations over time? For instance,
More informationLecture 4: More on Continuous Random Variables and Functions of Random Variables
Lecture 4: More on Continuous Random Variables and Functions of Random Variables ELE 525: Random Processes in Information Systems Hisashi Kobayashi Department of Electrical Engineering Princeton University
More information6.891 Machine learning and neural networks
6.89 Machine learning and neural networks Midterm exam: SOLUTIONS October 3, 2 (2 points) Your name and MIT ID: No Body, MIT ID # Problem. (6 points) Consider a twolayer neural network with two inputs
More information