# Nuno Vasconcelos UCSD


## Transcription

1 Bayesian parameter estimation, Nuno Vasconcelos, UCSD

2 Maximum likelihood parameter estimation in three steps:
1. choose a parametric model for the probabilities; to make this clear we denote the vector of parameters by Θ: P_X(x; Θ). Note that this means that Θ is NOT a random variable.
2. assemble a dataset D = {x_1, ..., x_n} of examples drawn independently.
3. select the parameters that maximize the probability of the data:
Θ* = arg max_Θ P_X(D; Θ) = arg max_Θ log P_X(D; Θ)
P_X(D; Θ) is the likelihood of the parameter Θ with respect to the data.
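The three steps above can be sketched for a Gaussian model, where step 3 has a closed form; the model family, sample size, and seed are illustrative assumptions, not taken from the slides:

```python
import math
import random

def gaussian_log_likelihood(data, mu, sigma2):
    """Log-likelihood log P_X(D; Theta) for Theta = (mu, sigma2), i.i.d. Gaussian data."""
    n = len(data)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in data) / (2 * sigma2))

def ml_estimate(data):
    """Step 3: the maximizing parameters, which for a Gaussian are the
    sample mean and the (biased) sample variance."""
    n = len(data)
    mu = sum(data) / n
    sigma2 = sum((x - mu) ** 2 for x in data) / n
    return mu, sigma2

random.seed(0)
D = [random.gauss(5.0, 2.0) for _ in range(1000)]  # step 2: independent sample
mu_ml, sigma2_ml = ml_estimate(D)

# any perturbed parameter choice scores a lower likelihood than the ML estimate
assert gaussian_log_likelihood(D, mu_ml, sigma2_ml) >= gaussian_log_likelihood(D, mu_ml + 0.1, sigma2_ml)
```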

3 Least squares: there are interesting connections between ML estimation and least squares methods. E.g., in a regression problem we have two random variables X and Y, a dataset of examples D = {(x_1, y_1), ..., (x_n, y_n)}, and a parametric model of the form y = f(x; Θ) + ε, where Θ is a parameter vector and ε a random variable that accounts for noise, e.g. ε ~ N(0, σ²).

4 Least squares: assuming that the family of models is known, e.g. the polynomial family
f(x; Θ) = Σ_{k=0}^{K} θ_k x^k,
this is really just a problem of parameter estimation where the data is distributed as
P_{Y|X}(y | x; Θ) = G(y, f(x; Θ), σ).
Note that x is always known, and the mean is a function of x and Θ. In the homework, you will show that Θ* = (Γ^T Γ)^{-1} Γ^T y.

5 Least squares: where Γ is the matrix with one row per sample point,
Γ = [ 1 x_1 ... x_1^K ; ... ; 1 x_n ... x_n^K ].
Conclusion: least squares estimation is really just ML estimation under the assumption of Gaussian noise (independent sample, ε ~ N(0, σ²)). Once again, probability makes the assumptions explicit.
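The normal-equations solution Θ* = (Γ^T Γ)^{-1} Γ^T y can be sketched as follows for a polynomial model; the degree, true coefficients, and noise level are made-up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 2                                    # polynomial degree (assumed)
theta_true = np.array([1.0, -2.0, 0.5])  # illustrative coefficients

x = rng.uniform(-3, 3, size=200)
eps = rng.normal(0.0, 0.3, size=200)     # Gaussian noise, eps ~ N(0, sigma^2)
y = sum(theta_true[k] * x ** k for k in range(K + 1)) + eps

# design matrix Gamma: row i is [1, x_i, x_i^2, ..., x_i^K]
Gamma = np.vander(x, K + 1, increasing=True)

# least-squares / ML solution Theta* = (Gamma^T Gamma)^{-1} Gamma^T y
theta_star = np.linalg.solve(Gamma.T @ Gamma, Gamma.T @ y)
assert np.allclose(theta_star, theta_true, atol=0.2)
```

Solving the linear system directly avoids forming the explicit inverse, which is the numerically preferred way to evaluate the same formula.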

6 Least squares solution: due to the connection to parameter estimation, we can also talk about the quality of the least squares solution. In particular, we know that it is unbiased, that its variance goes to zero as the number of points increases, and that it is the BLUE (best linear unbiased) estimator for f(x; Θ). Under the statistical formulation we can also see how the optimal estimator changes with the assumptions: ML estimation can also lead to (homework) weighted least squares, minimization of L_p norms, and robust estimators.

7 Bayesian parameter estimation: Bayesian parameter estimation is an alternative framework for parameter estimation. It turns out that the division between Bayesian and ML methods is quite fundamental: it stems from a different way of interpreting probabilities, frequentist vs Bayesian. There is a long debate about which is best; this debate goes to the core of what probabilities mean. To understand it, we have to distinguish two components: the definition of probability (this does not change) and the assessment of probability (this changes). Let's start with a brief review of the part that does not change.

8 Probability: probability is a language to deal with processes that are non-deterministic. Examples: if I flip a coin 100 times, how many heads can I expect to see? what is the weather going to be like tomorrow? are my stocks going to be up or down? am I in front of a classroom or is this just a picture of it?

9 Sample space: the most important concept is that of a sample space. Our process defines a set of events; these are the outcomes or states of the process. Example: we roll a pair of dice; call the value on the up face at the n-th toss x_n. Note that possible events such as "odd number on second throw", "two sixes", or "x_1 = 2 and x_2 = 6" can all be expressed as combinations of the sample-space events.

10 The sample space is the list of possible events that satisfies the following properties: finest grain (all possible distinguishable events are listed separately), mutually exclusive (if one event happens the others do not; if x_1 = 5 it cannot be anything else), and collectively exhaustive (any possible outcome can be expressed as unions of sample-space events). The mutually exclusive property simplifies the calculation of the probability of complex events; collectively exhaustive means that there is no possible outcome to which we cannot assign a probability.

11 Probability measure: the probability of an event is a number expressing the chance that the event will be the outcome of the process. A probability measure satisfies three axioms: P(A) ≥ 0 for any event A; P(universal event) = 1; if A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B). All of this has to do with the definition of probability, which is the same under the Bayesian and frequentist views; what changes is how probabilities are assessed.

12 Frequentist view: under the frequentist view, probabilities are relative frequencies. I throw my dice n times; in m of those the sum is 5; I say that P(sum = 5) = m/n. This is intimately connected with the ML method: it is the ML estimate for the probability of a Bernoulli process with states (sum = 5, everything else). It makes sense when we have a lot of observations: no bias, decreasing variance, convergence to the true probability.
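The convergence of the relative frequency to the true probability can be checked with a quick simulation; the sample size and seed are arbitrary choices:

```python
import random

random.seed(1)
n = 200_000
# count tosses of a pair of dice whose faces sum to 5
m = sum(1 for _ in range(n)
        if random.randint(1, 6) + random.randint(1, 6) == 5)

p_hat = m / n      # relative frequency: the frequentist / ML estimate
p_true = 4 / 36    # (1,4), (2,3), (3,2), (4,1) out of 36 equally likely outcomes
assert abs(p_hat - p_true) < 0.005
```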

13 Problems: there are many instances where we do not have a large number of observations. Consider the problem of crossing a street. This is a decision problem with two states: Y = 0 (I am going to get hurt) and Y = 1 (I will make it safely). The optimal decision is computable by the Bayes decision rule: collect some measurements that are informative, e.g. X = {size, distance, speed} of incoming cars; collect examples under both states and estimate all the probabilities. Somehow this does not sound like a great idea!

14 Problems: under the frequentist view you need to repeat an experiment a large number of times to estimate any probabilities. Yet people are very good at estimating probabilities for problems in which it is impossible to set up such experiments. For example: will I die if I join the army? will Democrats or Republicans win the next election? is there a God? will I graduate in two years? People go to the point of making life-changing decisions based on these probability estimates (enlisting in the army, etc.).

15 Subjective probability: this motivates an alternative definition of probabilities. Note that this has more to do with how probabilities are assessed than with the probability definition itself; we still have a sample space, a probability measure, etc. However, the probabilities are not equated to relative counts. This is usually referred to as subjective probability: probabilities are degrees of belief in the outcomes of the experiment; they are individual (they vary from person to person); they are not ratios of experimental outcomes. E.g., for a very religious person P(God exists) ~ 1; for a casual churchgoer P(God exists) ~ 0.8 (e.g. accepts evolution, etc.); for a non-religious person P(God exists) ~ 0.

16 Problems: in practice, why do we care about this? Under the notion of subjective probability, the entire ML framework makes little sense: there is a magic number that is estimated from the world and determines our beliefs; to evaluate my estimates I have to run experiments over and over again and measure quantities like bias and variance. This is not how people behave: when we make estimates we attach a degree of confidence to them, without further experiments. There is only one model (the ML model) for the probability of the data, no multiple explanations; and there is no way to specify that some models are, a priori, better than others.

17 Bayesian parameter estimation: the main difference with respect to ML is that in the Bayesian case Θ is a random variable. Basic concepts: a training set D = {x_1, ..., x_n} of examples drawn independently; a probability density for the observations given the parameter, P_{X|Θ}(x | θ); a prior distribution for the parameter configurations, P_Θ(θ), that encodes prior beliefs about them. Goal: to compute the posterior distribution P_{Θ|T}(θ | D).

18 Bayes vs ML: there are a number of significant differences between Bayesian and ML estimates. D1: ML produces a number, the best estimate; to measure its goodness we need to measure bias and variance, and this can only be done with repeated experiments. Bayes produces a complete characterization of the parameter from the single dataset: in addition to the most probable estimate, we obtain a characterization of the uncertainty (a narrow posterior means lower uncertainty, a wide one higher uncertainty).
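A sketch of this complete characterization for the conjugate Gaussian case (the prior, known variance, and data values are illustrative assumptions): the posterior is a full distribution whose variance shrinks as the dataset grows.

```python
def gaussian_mean_posterior(data, sigma2, mu0, s02):
    """Posterior P_{Theta|T}(theta|D) for the mean of a Gaussian with known
    variance sigma2, under a Gaussian prior N(mu0, s02).
    The posterior is again Gaussian: returns its mean and variance."""
    n = len(data)
    xbar = sum(data) / n
    sn2 = 1.0 / (1.0 / s02 + n / sigma2)          # posterior variance
    mun = sn2 * (mu0 / s02 + n * xbar / sigma2)   # posterior mean
    return mun, sn2

small = [2.1, 1.8, 2.4]        # 3 observations
large = small * 50             # 150 observations with the same sample mean
_, var_small = gaussian_mean_posterior(small, 1.0, 0.0, 1.0)
_, var_large = gaussian_mean_posterior(large, 1.0, 0.0, 1.0)

# more data -> narrower posterior -> lower uncertainty, from a single dataset
assert var_large < var_small
```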

19 Bayes vs ML. D2: the optimal estimate. Under ML there is one best estimate; under Bayes there is no best estimate, only a random variable that takes different values with different probabilities; technically speaking, it makes no sense to talk about "the best estimate". D3: predictions. Remember that we do not really care about the parameters themselves; they are needed only in the sense that they allow us to build models that can be used to make predictions (e.g. the BDR). Unlike ML, Bayes uses ALL the information in the training set to make predictions.

20 Bayes vs ML: let's consider the BDR under the 0-1 loss and an independent sample D = {x_1, ..., x_n}. ML-BDR: pick i if
i*(x) = arg max_i P_{X|Y}(x | i; Θ_i*) P_Y(i), where Θ_i* = arg max_Θ P_{X|Y}(D | i; Θ).
Two steps: find Θ_i*, then plug it into the BDR. All information not captured by Θ_i* is lost, not used at decision time.

21 Bayes vs ML: note that we know that information is lost; e.g. we cannot even know how good an estimate Θ* is unless we run multiple experiments and measure bias/variance. Bayesian BDR: under the Bayesian framework, everything is conditioned on the training data. Denote by T = {X_1, ..., X_n} the set of random variables from which the training sample D = {x_1, ..., x_n} is drawn. B-BDR: pick i if
i*(x) = arg max_i P_{X|Y,T}(x | i, D) P_Y(i).
The decision is conditioned on the entire training set.

22 Bayesian BDR: to compute the conditional probabilities, we use the marginalization equation
P_{X|Y,T}(x | i, D) = ∫ P_{X|Θ,Y,T}(x | θ, i, D) P_{Θ|Y,T}(θ | i, D) dθ.
Note 1: when the parameter value is known, x no longer depends on T (e.g. X|Θ ~ N(θ, σ²)), so we can simplify the equation above into
P_{X|Y,T}(x | i, D) = ∫ P_{X|Θ,Y}(x | θ, i) P_{Θ|Y,T}(θ | i, D) dθ.
Note 2: once again this can be done in two steps (per class): find P_{Θ|T}(θ | D); compute P_{X|Y,T}(x | i, D) and plug into the BDR. No training information is lost.

23 Bayesian BDR: in summary, pick i if
i*(x) = arg max_i P_{X|Y,T}(x | i, D) P_Y(i), where
P_{X|Y,T}(x | i, D) = ∫ P_{X|Y,Θ}(x | i, θ) P_{Θ|Y,T}(θ | i, D) dθ.
Note: as before, the bottom equation is repeated for each class. Hence we can drop the dependence on the class and consider the more general problem of estimating
P_{X|T}(x | D) = ∫ P_{X|Θ}(x | θ) P_{Θ|T}(θ | D) dθ.

24 The predictive distribution: the distribution
P_{X|T}(x | D) = ∫ P_{X|Θ}(x | θ) P_{Θ|T}(θ | D) dθ
is known as the predictive distribution. This follows from the fact that it allows us to predict the value of x given ALL the information available in the training set. Note that it can also be written as
P_{X|T}(x | D) = E_{Θ|T}[ P_{X|Θ}(x | Θ) | T = D ].
Since each parameter value defines a model, this is an expectation over all possible models; each model is weighted by its posterior probability given the training data.

25 The predictive distribution: suppose that P_{X|Θ}(x | θ) ~ N(θ, 1) and P_{Θ|T}(θ | D) ~ N(μ, σ²). Each parameter value θ_i defines a Gaussian model centered at μ_i, weighted by its posterior probability π_i. The predictive distribution is an average of all these Gaussians:
P_{X|T}(x | D) = ∫ P_{X|Θ}(x | θ) P_{Θ|T}(θ | D) dθ.
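For this conjugate pair the average of Gaussians can be checked numerically: the predictive integral collapses to another Gaussian, N(μ, 1 + σ²). A sketch (the integration limits, grid size, and test points are arbitrary choices):

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def predictive(x, mu, sigma2, lo=-20.0, hi=20.0, steps=4000):
    """P_{X|T}(x|D) = integral of P_{X|Theta}(x|theta) P_{Theta|T}(theta|D) dtheta,
    computed by a midpoint Riemann sum over the posterior N(mu, sigma2)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * h
        total += normal_pdf(x, theta, 1.0) * normal_pdf(theta, mu, sigma2) * h
    return total

# the average of all the N(theta, 1) models is the wider Gaussian N(mu, 1 + sigma2)
mu, sigma2 = 0.7, 0.5
for x in (-1.0, 0.0, 2.0):
    assert abs(predictive(x, mu, sigma2) - normal_pdf(x, mu, 1.0 + sigma2)) < 1e-5
```

Note the predictive variance 1 + σ² is larger than the variance of any single model: averaging over parameter uncertainty widens the prediction.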

26 The predictive distribution, Bayes vs ML: ML picks one model; Bayes averages all models. Are Bayesian predictions very different from those of ML? They can be, unless the posterior P_{Θ|T}(θ | D) is narrow: when it is sharply peaked around its maximum, Bayes ~ ML; when it is broad, the two can be very different.

27 The predictive distribution: hence, ML can be seen as a special case of Bayes; when you are very confident about the model, picking one is good enough. In coming lectures we will see that if the sample is quite large, the posterior tends to be narrow. This is intuitive: given a lot of training data, there is little uncertainty about what the model is. Bayes can make a difference when there is little data, and we have already seen that this is the important case, since the variance of ML tends to go down as the sample increases. Overall, Bayes regularizes the ML estimate when the estimate is uncertain, and converges to ML when there is a lot of certainty.

28 MAP approximation: this sounds good, so why use ML at all? The main problem with Bayes is that the integral
P_{X|T}(x | D) = ∫ P_{X|Θ}(x | θ) P_{Θ|T}(θ | D) dθ
can be quite nasty. In practice one is frequently forced to use approximations. One possibility is to do something similar to ML, i.e. pick only one model. This can be made to account for the prior by picking the model that has the largest posterior probability given the training data:
Θ_MAP = arg max_θ P_{Θ|T}(θ | D).

29 MAP approximation: this can usually be computed, since
Θ_MAP = arg max_θ P_{Θ|T}(θ | D) = arg max_θ P_{T|Θ}(D | θ) P_Θ(θ),
and corresponds to approximating the posterior by a delta function centered at its maximum:
P_{Θ|T}(θ | D) ≈ δ(θ − Θ_MAP).
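A sketch of the arg max computation with a Bernoulli likelihood and a Beta prior; this conjugate example is an assumption for illustration, not taken from the slides:

```python
def bernoulli_map(heads, n, a, b):
    """MAP estimate arg max_theta P_{T|Theta}(D|theta) P_Theta(theta)
    for a Bernoulli likelihood with a Beta(a, b) prior (requires a, b > 1).
    The maximum of the posterior has the closed form below."""
    return (heads + a - 1) / (n + a + b - 2)

def bernoulli_ml(heads, n):
    """ML estimate: the relative frequency."""
    return heads / n

# 3 heads in 4 tosses: ML says 0.75; a Beta(2, 2) prior pulls the MAP toward 0.5
assert bernoulli_ml(3, 4) == 0.75
assert abs(bernoulli_map(3, 4, 2, 2) - 4 / 6) < 1e-12
```

With more data the prior's pull fades and the MAP estimate converges to the ML estimate, matching the discussion above.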

30 MAP approximation: in this case
P_{X|T}(x | D) = ∫ P_{X|Θ}(x | θ) δ(θ − Θ_MAP) dθ = P_{X|Θ}(x | Θ_MAP),
and the BDR becomes: pick i if
i*(x) = arg max_i P_{X|Y}(x | i; Θ_i,MAP) P_Y(i), where Θ_i,MAP = arg max_θ P_{T|Y,Θ}(D | i, θ) P_{Θ|Y}(θ | i).
When compared to ML, this has the advantage of still accounting for the prior (although only approximately).

31 MAP vs ML.
ML-BDR: pick i if i*(x) = arg max_i P_{X|Y}(x | i; Θ_i*) P_Y(i), where Θ_i* = arg max_θ P_{X|Y}(D | i; θ).
MAP-BDR: pick i if i*(x) = arg max_i P_{X|Y}(x | i; Θ_i,MAP) P_Y(i), where Θ_i,MAP = arg max_θ P_{T|Y,Θ}(D | i, θ) P_{Θ|Y}(θ | i).
The difference is non-negligible only when the dataset is small. There are better alternative approximations.

32 The Laplace approximation: this is a method for approximating any distribution P_X(x). It consists of approximating P_X(x) by a Gaussian centered at its peak. Let's assume that
P_X(x) = (1/Z) g(x),
where g(x) is an unnormalized distribution (g(x) > 0 for all x) and Z is the normalization constant
Z = ∫ g(x) dx.
We make a Taylor series approximation of log g(x) at its maximum x_0.

33 Laplace approximation: the Taylor expansion is
log g(x) ≈ log g(x_0) − (c/2)(x − x_0)² + ...
(the first-order term is zero because x_0 is a maximum), with
c = − d²/dx² log g(x) |_{x = x_0}.
We approximate g(x) by an unnormalized Gaussian
g'(x) = g(x_0) exp{ −(c/2)(x − x_0)² }
and then compute the normalization constant
Z ≈ g(x_0) √(2π/c).

34 Laplace approximation: this can obviously be extended to the multivariate case. The approximation is
log g(x) ≈ log g(x_0) − (1/2)(x − x_0)^T A (x − x_0),
with A the negative Hessian of log g(x) at x_0,
A_ij = − ∂²/∂x_i ∂x_j log g(x) |_{x = x_0},
and the normalization constant
Z ≈ g(x_0) (2π)^{d/2} |A|^{−1/2}.
In physics this is also called a saddle-point approximation.
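The univariate formula Z ≈ g(x_0)√(2π/c) can be sketched on an unnormalized Gamma-shaped g(x); this test function is an illustrative choice (its exact normalizer is the Gamma function, so the approximation can be checked):

```python
import math

def laplace_Z(g, log_g_dd, x0):
    """Laplace approximation Z ~= g(x0) * sqrt(2*pi / c), where
    c = -(d^2/dx^2) log g(x) evaluated at the maximum x0."""
    c = -log_g_dd(x0)
    return g(x0) * math.sqrt(2 * math.pi / c)

# illustrative g(x) = x^(a-1) e^(-x) on x > 0, with exact Z = Gamma(a)
a = 10.0
g = lambda x: x ** (a - 1) * math.exp(-x)
log_g_dd = lambda x: -(a - 1) / x ** 2   # second derivative of log g(x)
x0 = a - 1                               # maximum of log g(x)

Z_approx = laplace_Z(g, log_g_dd, x0)
Z_exact = math.gamma(a)
assert abs(Z_approx - Z_exact) / Z_exact < 0.02   # within ~1% here
```

The quality of the fit depends on how Gaussian-like g is near its peak; for this g the approximation improves as a grows.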

35 Laplace approximation: note that the approximation can be made directly for the predictive distribution,
P_{X|T}(x | D) ≈ G(x, x*, A^{−1}),
or for the parameter posterior,
P_{Θ|T}(θ | D) ≈ G(θ, Θ_MAP, A^{−1}),
in which case
P_{X|T}(x | D) ≈ ∫ P_{X|Θ}(x | θ) G(θ, Θ_MAP, A^{−1}) dθ.
This is clearly superior to the MAP approximation
P_{X|T}(x | D) = ∫ P_{X|Θ}(x | θ) δ(θ − Θ_MAP) dθ.

36 Other methods: there are two other main alternatives when this is not enough: variational approximations and sampling methods (Markov chain Monte Carlo). Variational approximations consist of bounding the intractable function and searching for the best bound. Sampling methods consist of designing a Markov chain that has the desired distribution as its equilibrium distribution, and then sampling from this chain. Sampling methods converge to the true distribution, but convergence is slow and hard to detect.



Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

### A Note on the Decomposition of a Random Sample Size

A Note on the Decomposton of a Random Sample Sze Klaus Th. Hess Insttut für Mathematsche Stochastk Technsche Unverstät Dresden Abstract Ths note addresses some results of Hess 2000) on the decomposton

### The Analysis of Outliers in Statistical Data

THALES Project No. xxxx The Analyss of Outlers n Statstcal Data Research Team Chrysses Caron, Assocate Professor (P.I.) Vaslk Karot, Doctoral canddate Polychrons Economou, Chrstna Perrakou, Postgraduate

### Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification

Lecture 4: More classfers and classes C4B Machne Learnng Hlary 20 A. Zsserman Logstc regresson Loss functons revsted Adaboost Loss functons revsted Optmzaton Multple class classfcaton Logstc Regresson

### Can Auto Liability Insurance Purchases Signal Risk Attitude?

Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

### Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn

### EE201 Circuit Theory I 2015 Spring. Dr. Yılmaz KALKAN

EE201 Crcut Theory I 2015 Sprng Dr. Yılmaz KALKAN 1. Basc Concepts (Chapter 1 of Nlsson - 3 Hrs.) Introducton, Current and Voltage, Power and Energy 2. Basc Laws (Chapter 2&3 of Nlsson - 6 Hrs.) Voltage

### Expected Value. Background

Please note: Before I slam you wth the notaton from Chapter 9 - Secton, I want you to understand how smple Mathematcal Expectaton really s. My frst smplfcaton: I wll refer to t as Expected Value (E )from

### Regression Models for a Binary Response Using EXCEL and JMP

SEMATECH 997 Statstcal Methods Symposum Austn Regresson Models for a Bnary Response Usng EXCEL and JMP Davd C. Trndade, Ph.D. STAT-TECH Consultng and Tranng n Appled Statstcs San Jose, CA Topcs Practcal

### CHAPTER 7 THE TWO-VARIABLE REGRESSION MODEL: HYPOTHESIS TESTING

CHAPTER 7 THE TWO-VARIABLE REGRESSION MODEL: HYPOTHESIS TESTING QUESTIONS 7.1. (a) In the regresson contet, the method of least squares estmates the regresson parameters n such a way that the sum of the

### MARKET SHARE CONSTRAINTS AND THE LOSS FUNCTION IN CHOICE BASED CONJOINT ANALYSIS

MARKET SHARE CONSTRAINTS AND THE LOSS FUNCTION IN CHOICE BASED CONJOINT ANALYSIS Tmothy J. Glbrde Assstant Professor of Marketng 315 Mendoza College of Busness Unversty of Notre Dame Notre Dame, IN 46556

### Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

### Implementation of Deutsch's Algorithm Using Mathcad

Implementaton of Deutsch's Algorthm Usng Mathcad Frank Roux The followng s a Mathcad mplementaton of Davd Deutsch's quantum computer prototype as presented on pages - n "Machnes, Logc and Quantum Physcs"

### Moment of a force about a point and about an axis

3. STATICS O RIGID BODIES In the precedng chapter t was assumed that each of the bodes consdered could be treated as a sngle partcle. Such a vew, however, s not always possble, and a body, n general, should

### Learning Curves for Gaussian Processes via Numerical Cubature Integration

Learnng Curves for Gaussan Processes va Numercal Cubature Integraton Smo Särkkä Department of Bomedcal Engneerng and Computatonal Scence Aalto Unversty, Fnland smo.sarkka@tkk.f Abstract. Ths paper s concerned

### NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6

PAR TESTS If a WEIGHT varable s specfed, t s used to replcate a case as many tmes as ndcated by the weght value rounded to the nearest nteger. If the workspace requrements are exceeded and samplng has

### Solution : (a) FALSE. Let C be a binary one-error correcting code of length 9. Then it follows from the Sphere packing bound that.

MATH 29T Exam : Part I Solutons. TRUE/FALSE? Prove your answer! (a) (5 pts) There exsts a bnary one-error correctng code of length 9 wth 52 codewords. (b) (5 pts) There exsts a ternary one-error correctng

### VLSI Technology Dr. Nandita Dasgupta Department of Electrical Engineering Indian Institute of Technology, Madras

VLI Technology Dr. Nandta Dasgupta Department of Electrcal Engneerng Indan Insttute of Technology, Madras Lecture - 11 Oxdaton I netcs of Oxdaton o, the unt process step that we are gong to dscuss today

### Lecture 5,6 Linear Methods for Classification. Summary

Lecture 5,6 Lnear Methods for Classfcaton Rce ELEC 697 Farnaz Koushanfar Fall 2006 Summary Bayes Classfers Lnear Classfers Lnear regresson of an ndcator matrx Lnear dscrmnant analyss (LDA) Logstc regresson

### Comment on Rotten Kids, Purity, and Perfection

Comment Comment on Rotten Kds, Purty, and Perfecton Perre-André Chappor Unversty of Chcago Iván Wernng Unversty of Chcago and Unversdad Torcuato d Tella After readng Cornes and Slva (999), one gets the

### A Probabilistic Theory of Coherence

A Probablstc Theory of Coherence BRANDEN FITELSON. The Coherence Measure C Let E be a set of n propostons E,..., E n. We seek a probablstc measure C(E) of the degree of coherence of E. Intutvely, we want

### LETTER IMAGE RECOGNITION

LETTER IMAGE RECOGNITION 1. Introducton. 1. Introducton. Objectve: desgn classfers for letter mage recognton. consder accuracy and tme n takng the decson. 20,000 samples: Startng set: mages based on 20

### Multivariate EWMA Control Chart

Multvarate EWMA Control Chart Summary The Multvarate EWMA Control Chart procedure creates control charts for two or more numerc varables. Examnng the varables n a multvarate sense s extremely mportant

### SIX WAYS TO SOLVE A SIMPLE PROBLEM: FITTING A STRAIGHT LINE TO MEASUREMENT DATA

SIX WAYS TO SOLVE A SIMPLE PROBLEM: FITTING A STRAIGHT LINE TO MEASUREMENT DATA E. LAGENDIJK Department of Appled Physcs, Delft Unversty of Technology Lorentzweg 1, 68 CJ, The Netherlands E-mal: e.lagendjk@tnw.tudelft.nl

### Approximating Cross-validatory Predictive Evaluation in Bayesian Latent Variables Models with Integrated IS and WAIC

Approxmatng Cross-valdatory Predctve Evaluaton n Bayesan Latent Varables Models wth Integrated IS and WAIC Longha L Department of Mathematcs and Statstcs Unversty of Saskatchewan Saskatoon, SK, CANADA