# Lecture 11: Further Topics in Bayesian Statistical Modeling: Graphical Modelling and Model Selection with DIC


## Transcription

1 Lecture 11: Further topics in Bayesian statistical modeling [1] Lecture 11: Further Topics in Bayesian Statistical Modeling: Graphical Modelling and Model Selection with DIC

### Graphical Models

Statistical modeling of complex systems usually involves many interconnected random variables.

Question: How do we build these connections?

Answer: Think locally, act globally!

Directed Acyclic Graphs (DAGs):

- Every quantity (random variable) in a model is represented by a node.
- Relationships between nodes are represented by arrows.
- The graph is used to represent a set of conditional independence statements.
- It expresses the joint relationship between all known quantities (data) and unknown quantities (parameters, predictions, missing data, etc.) in a model through a series of simple local relationships.
- It provides the basis for computations.

### Conditional independence

Two variables $X$ and $Y$ are statistically independent if

$$p(X, Y) = p(X)\,p(Y).$$

Equivalently, $X$ and $Y$ are statistically independent if

$$p(Y \mid X) = p(Y).$$

Conditional independence: given three variables $X$, $Y$ and $Z$, we say that $X$ and $Y$ are conditionally independent given $Z$, denoted by $X \perp Y \mid Z$, if

$$p(X, Y \mid Z) = p(X \mid Z)\,p(Y \mid Z).$$
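The two definitions can be checked numerically on a small discrete example. The sketch below is only an illustration: the probability tables are hypothetical values chosen so that $X$ and $Y$ depend on each other only through $Z$, and the helper names (`marginal`, `conditionally_independent`, `independent`) are not from the lecture.

```python
from itertools import product

# Hypothetical probability tables for binary X, Y, Z (illustration only),
# built so that X and Y are conditionally independent given Z.
p_z = {0: 0.4, 1: 0.6}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_x_given_z[z][x]
p_y_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.1, 1: 0.9}}  # p_y_given_z[z][y]

# Joint distribution p(x, y, z) = p(z) p(x | z) p(y | z).
joint = {
    (x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
    for x, y, z in product((0, 1), repeat=3)
}

def marginal(joint, keep):
    """Sum the joint over all variables whose positions are not in `keep`."""
    out = {}
    for values, prob in joint.items():
        key = tuple(values[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out

def conditionally_independent(joint, tol=1e-12):
    """Check p(x, y | z) == p(x | z) p(y | z) for every (x, y, z)."""
    pz = marginal(joint, (2,))
    pxz = marginal(joint, (0, 2))
    pyz = marginal(joint, (1, 2))
    return all(
        abs(joint[(x, y, z)] / pz[(z,)]
            - (pxz[(x, z)] / pz[(z,)]) * (pyz[(y, z)] / pz[(z,)])) < tol
        for x, y, z in joint
    )

def independent(joint, tol=1e-12):
    """Check p(x, y) == p(x) p(y) for every (x, y)."""
    pxy = marginal(joint, (0, 1))
    px = marginal(joint, (0,))
    py = marginal(joint, (1,))
    return all(abs(pxy[(x, y)] - px[(x,)] * py[(y,)]) < tol for x, y in pxy)

ci = conditionally_independent(joint)
mi = independent(joint)
print("X and Y conditionally independent given Z:", ci)
print("X and Y marginally independent:", mi)
```

With these tables the first check holds and the second fails, which shows that conditional independence given $Z$ does not imply marginal independence.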

### Example: A Toy Model (Spiegelhalter, 1998)

From a DAG, we can read off some conditional independence statements (local Markov property) that use the natural order of the graph, e.g.

$$B \perp C, E, F \mid A.$$

How do we read further conditional independence statements from a DAG? We define a moral graph by:

- marrying the parents (joining every pair of nodes that share a child),
- dropping the arrows.

From this graph, different properties can be deduced, in particular the global Markov property: any two subsets of nodes separated by a third one are conditionally independent given the third. By separated, we mean that there is no path between the two subsets that does not go through the third one.

In particular,

$$p(v \mid \text{rest}) = p(v \mid \text{neighbours of } v),$$

where the neighbours of $v$ are its parents, spouses and children.

### Moral graph

$$D \perp A, E, F \mid (B, C),$$

i.e. $p(D \mid \text{rest}) = p(D \mid B, C)$.
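The moralisation step can be sketched in a few lines of code. The parent sets below are an assumption: the figure of the toy DAG is not reproduced in this transcript, so they are inferred from the local terms listed in the Gibbs sampling discussion ($B$ and $C$ depend on $A$, $D$ on $B$ and $C$, $E$ on $A$ and $F$). The function name `moral_graph` is hypothetical.

```python
from itertools import combinations

# Parents of each node in the toy DAG (assumed structure, see lead-in).
parents = {
    "A": [], "B": ["A"], "C": ["A"],
    "D": ["B", "C"], "E": ["A", "F"], "F": [],
}

def moral_graph(parents):
    """Return the undirected moral graph as {node: set of neighbours}."""
    nbrs = {v: set() for v in parents}
    for child, pa in parents.items():
        # Drop the arrows: each parent-child edge becomes undirected.
        for p in pa:
            nbrs[child].add(p)
            nbrs[p].add(child)
        # Marry the parents: link every pair of parents of the same child.
        for p, q in combinations(pa, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    return nbrs

moral = moral_graph(parents)
for v in sorted(moral):
    print(v, "->", sorted(moral[v]))
```

Under this assumed structure the neighbours of `D` are `B` and `C`, in agreement with $p(D \mid \text{rest}) = p(D \mid B, C)$, while `A` gains `F` as a neighbour because both are parents of `E`.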

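### Link between Gibbs sampling and DAG

If we want to sample from $p(A, B, C, D, E, F)$ with a Gibbs sampler, we define each full conditional distribution using the conditional independence pattern of the DAG. The DAG writes the joint distribution as a product of local terms, one per node given its parents:

$$p(A, B, C, D, E, F) = p(A)\,p(B \mid A)\,p(C \mid A)\,p(D \mid B, C)\,p(E \mid A, F)\,p(F).$$

The full conditional of a node is proportional to the product of the local terms in which the node appears (its own term and those of its children), so it depends only on the neighbours of the node. We then sample iteratively from:

- $A \sim p(A \mid \text{rest}) \propto p(A)\,p(B \mid A)\,p(C \mid A)\,p(E \mid A, F)$
- $B \sim p(B \mid \text{rest}) \propto p(B \mid A)\,p(D \mid B, C)$
- $C \sim p(C \mid \text{rest}) \propto p(C \mid A)\,p(D \mid B, C)$
- $D \sim p(D \mid \text{rest}) = p(D \mid B, C)$
- $E \sim p(E \mid \text{rest}) = p(E \mid A, F)$
- $F \sim p(F \mid \text{rest}) \propto p(F)\,p(E \mid A, F)$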
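A minimal sketch of such a sampler for the toy DAG follows, assuming all six nodes are binary. The conditional probability tables are hypothetical numbers chosen for illustration, and the function names are not from the lecture. Each full conditional is built from the node's own local term and the local terms of its children, and the Gibbs estimate of $P(D = 1)$ is compared with the exact value obtained by enumerating all 64 states.

```python
import random
from itertools import product

NODES = ["A", "B", "C", "D", "E", "F"]
PARENTS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["A", "F"], "F": []}
CHILDREN = {v: [c for c in NODES if v in PARENTS[c]] for v in NODES}

# Hypothetical P(node = 1 | parent values), keyed by the tuple of parent values.
P1 = {
    "A": {(): 0.3},
    "B": {(0,): 0.2, (1,): 0.7},
    "C": {(0,): 0.5, (1,): 0.1},
    "D": {(0, 0): 0.1, (0, 1): 0.6, (1, 0): 0.4, (1, 1): 0.9},
    "E": {(0, 0): 0.2, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.8},
    "F": {(): 0.6},
}

def local_term(node, state):
    """p(node | parents) evaluated at the current state."""
    p1 = P1[node][tuple(state[p] for p in PARENTS[node])]
    return p1 if state[node] == 1 else 1.0 - p1

def sample_full_conditional(node, state, rng):
    """Draw node from p(node | rest): own local term times the children's terms."""
    weights = []
    for value in (0, 1):
        state[node] = value
        w = local_term(node, state)
        for child in CHILDREN[node]:
            w *= local_term(child, state)
        weights.append(w)
    state[node] = 1 if rng.random() < weights[1] / (weights[0] + weights[1]) else 0

def gibbs_p_d(n_iter, burn_in, seed=0):
    """Estimate P(D = 1) by cycling through all full conditionals."""
    rng = random.Random(seed)
    state = {v: 0 for v in NODES}
    count_d = 0
    for it in range(n_iter):
        for node in NODES:
            sample_full_conditional(node, state, rng)
        if it >= burn_in:
            count_d += state["D"]
    return count_d / (n_iter - burn_in)

def exact_p_d():
    """Exact P(D = 1) by summing the joint over all states with D = 1."""
    total = 0.0
    for values in product((0, 1), repeat=len(NODES)):
        state = dict(zip(NODES, values))
        if state["D"] == 1:
            prob = 1.0
            for v in NODES:
                prob *= local_term(v, state)
            total += prob
    return total

p_gibbs = gibbs_p_d(n_iter=40000, burn_in=1000)
p_exact = exact_p_d()
print(f"P(D = 1): Gibbs {p_gibbs:.3f}, exact {p_exact:.3f}")
```

With no observed nodes, the joint could also be sampled directly by drawing each node from its local term in topological order. The Gibbs form is shown because it carries over unchanged once some nodes are fixed at observed values.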

### Summary

- A DAG gives a non-algebraic description of the model.
- Using a DAG is an interpretable way of specifying joint distributions through simple local terms.
- It can be used to build hierarchical models.
- It is used to find locally all full conditional distributions in a Bayesian model.
- The DAG is used to program the kernel of the Gibbs sampler.

### WinBUGS and Graphical Models

- The WinBUGS User Manual recommends that the first step in any analysis should be the construction of a directed graphical model.
- In Bayesian analysis, both observable variables (data) and parameters are random variables.
- A Bayesian graphical model consists of nodes representing both data and parameters.
- These graphical representations can add clarity to complex patterns of dependency.

### WinBUGS implementation

DoodleBUGS is a tool for drawing graphical models. BUGS code for a model can be generated from the graph.

Types of nodes:

- Constants: fixed values, assigned values in the data; they cannot have parent nodes.
- Stochastic nodes: random variables assigned a probability distribution in the model; they can be observed (data) or unobserved (parameters).
- Deterministic nodes: derived from other nodes as mathematical or logical functions of them.

Arrays of nodes, e.g. data values `y[i]`, are represented compactly by a plate, indexed by $i = 1, \dots, N$.

Types of links between nodes:

- Single arrows represent stochastic dependence.
- Double arrows represent logical (mathematical) dependence.

### Example: regression model

A DAG representation for a linear regression model:

$$y_i \sim N(\mu_i, \tau), \quad i = 1, \dots, N,$$

with $\mu_i = \theta_1 x_{1,i} + \theta_2 x_{2,i}$ and precision $\tau = 1/\sigma^2$.
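To make the roles of the nodes concrete, the sketch below simulates from this regression model. It assumes two distinct covariates $x_1$ and $x_2$, and the covariate and parameter values are hypothetical. The means $\mu_i$ play the role of deterministic nodes and the $y_i$ of stochastic nodes. Because the second argument of the normal distribution is a precision, the standard deviation is $1/\sqrt{\tau}$.

```python
import math
import random

def simulate_regression(x1, x2, theta1, theta2, tau, seed=0):
    """Draw y_i ~ N(mu_i, tau) with mu_i = theta1*x1_i + theta2*x2_i, tau a precision."""
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(tau)  # tau = 1 / sigma^2
    mu = [theta1 * a + theta2 * b for a, b in zip(x1, x2)]  # deterministic nodes
    y = [rng.gauss(m, sigma) for m in mu]                   # stochastic nodes
    return mu, y

# Hypothetical covariates and parameter values, for illustration only.
x1 = [1.0] * 5
x2 = [0.0, 1.0, 2.0, 3.0, 4.0]
mu, y = simulate_regression(x1, x2, theta1=2.0, theta2=0.5, tau=4.0)
print("mu:", mu)
print("y: ", [round(v, 2) for v in y])
```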

### Multiple indexing

Multiple indexing is very useful for representing complex model structures:

- Each level of indexing of a variable requires its own plate in a graphical model.
- An array variable like $y_{ij}$ therefore requires two plates, one for each index. The $y_{ij}$ node lies in the intersection of the two plates. See the example Dyes from WinBUGS Examples Vol. I (complete nesting).
- Any variable indexed only by $j$, for example, lies in the $j$ plate but not in the $i$ plate. See the example Rats from WinBUGS Examples Vol. I (repeated measures): $x_j$ (time) is the same for each $i$ (rat), and so lies in the $j$ plate only.

[Figure: Dyes example from WinBUGS Examples Vol. I (complete nesting).]

[Figure: Rats example from WinBUGS Examples Vol. I (repeated measures).]

### More about model building

Model criticism and sensitivity analysis

Standard checks based on the fitted model can be applied to Bayesian modeling:

- Residuals: plot against covariates, check for autocorrelation, and so on.
- Prediction: check accuracy on an external validation set, or by cross-validation.

In addition:

- we should check for conflict between prior and data,
- we should check for unintended sensitivity to the prior,
- using MCMC, we can replicate parameters and data.

### Bayesian Model Selection

- Classical model selection criteria like $C_p$, AIC and BIC assume that the number of parameters in the model is a well-defined concept. It is taken to be equivalent to the degrees of freedom, or the number of free parameters.
- In Bayesian analysis the prior effectively restricts the freedom of these parameters to some extent, so the appropriate model degrees of freedom is less clear.
- Another issue in complex models (e.g. hierarchical models) is that the likelihood is not a well-defined concept.
- Moreover, the models to be compared need not be nested.

### Using DIC for model selection

Spiegelhalter et al. (2002) proposed a Bayesian model comparison criterion based on trading off goodness of fit and model complexity, the Deviance Information Criterion:

DIC = goodness of fit + complexity

They measure goodness of fit via the deviance

$$D(\theta) = -2 \log L(\text{data} \mid \theta)$$

and the complexity of the model via

$$p_D = E_{\theta \mid y}[D] - D\left(E_{\theta \mid y}[\theta]\right) = \bar{D} - D(\bar{\theta}),$$

i.e. the posterior mean deviance minus the deviance evaluated at the posterior mean of the parameters.

The DIC is defined similarly to AIC as

$$\mathrm{DIC} = D(\bar{\theta}) + 2\,p_D = \bar{D} + p_D.$$

- Models with smaller DIC are better supported by the data.
- DIC can be monitored in WinBUGS from the Inference/DIC menu.
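The step connecting the two expressions for DIC is short and may be worth writing out. It uses only the definition of $p_D$ given above. The AIC line is added for comparison, with $\hat{\theta}$ denoting the maximum likelihood estimate and $p$ the number of free parameters (notation not used elsewhere in these slides).

```latex
\begin{aligned}
p_D &= \bar{D} - D(\bar{\theta})
  \;\Longrightarrow\; \bar{D} = D(\bar{\theta}) + p_D, \\
\mathrm{DIC} &= D(\bar{\theta}) + 2\,p_D
  = \bigl(D(\bar{\theta}) + p_D\bigr) + p_D
  = \bar{D} + p_D, \\
\mathrm{AIC} &= D(\hat{\theta}) + 2\,p .
\end{aligned}
```

In this sense DIC replaces the maximum likelihood estimate by the posterior mean and the parameter count $p$ by the effective number of parameters $p_D$.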

### Example: Gelman et al., page 182

Suppose that the data model is $y \mid \mu \sim N(\mu, 1)$ with prior $\mu \sim \text{Unif}(0, 1000)$. Now suppose that we observe $y_1 = 0.5$ and $y_2 = 100$. What is the effective number of parameters $p_D$ in each case?

```
model {
  # likelihood: one observation per case, precision 1
  y1 ~ dnorm(mu1, 1)
  y2 ~ dnorm(mu2, 1)
  # flat priors restricted to (0, 1000)
  mu1 ~ dunif(0, 1000)
  mu2 ~ dunif(0, 1000)
}

# data
list(y1 = 0.5, y2 = 100)
```

[Table: Dbar, Dhat, pD and DIC for y1 and y2; the numerical values did not survive transcription.]

If we observe $y_1 = 0.5$, then the effective number of parameters $p_D$ is approximately 0.5, since roughly half the information in the posterior distribution comes from the data and half from the prior constraint of positivity. If we observe $y_2 = 100$, then the constraint is essentially irrelevant and the effective number of parameters is approximately 1.
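The behaviour of $p_D$ in this example can be reproduced outside WinBUGS with a short Monte Carlo sketch. It relies on one fact: with a flat prior on $(0, 1000)$ and unit variance, the posterior of $\mu$ is a $N(y, 1)$ distribution truncated to that interval, which is sampled here by simple rejection. The function names, sample size and seed are assumptions made for illustration, and the printed numbers are Monte Carlo estimates from this sketch, not the WinBUGS output of the original slide.

```python
import math
import random

def deviance(y, mu):
    """D(mu) = -2 log N(y | mu, 1)."""
    return math.log(2.0 * math.pi) + (y - mu) ** 2

def posterior_draws(y, n, rng, lower=0.0, upper=1000.0):
    """Sample mu | y for y ~ N(mu, 1), mu ~ Unif(lower, upper) by rejection.

    With a flat prior the posterior is N(y, 1) truncated to (lower, upper).
    """
    draws = []
    while len(draws) < n:
        mu = rng.gauss(y, 1.0)
        if lower < mu < upper:
            draws.append(mu)
    return draws

def dic_summary(y, n=200000, seed=1):
    """Return Dbar, Dhat, pD and DIC estimated from n posterior draws."""
    rng = random.Random(seed)
    mus = posterior_draws(y, n, rng)
    dbar = sum(deviance(y, m) for m in mus) / n   # posterior mean deviance
    dhat = deviance(y, sum(mus) / n)              # deviance at the posterior mean
    pd = dbar - dhat
    return {"Dbar": dbar, "Dhat": dhat, "pD": pd, "DIC": dbar + pd}

res1 = dic_summary(0.5)
res2 = dic_summary(100.0)
for name, res in (("y1 = 0.5", res1), ("y2 = 100", res2)):
    print(name, {k: round(v, 2) for k, v in res.items()})
```

The estimated $p_D$ should come out close to 0.5 for $y_1$ and close to 1 for $y_2$, in line with the interpretation given above.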

### Some comments

- $p_D$ is not invariant to reparametrization, i.e. it depends on which estimate is used in $D(\bar{\theta})$.
- $p_D$ can be negative if there is a strong prior-data conflict.
- DIC and $p_D$ are particularly useful in hierarchical models.
- $p_D$ depends on the model and on the data. This is fundamentally different from AIC or BIC.
