CSE 150: Assignment 2
Summer 2016

Out: Thu Jun 30
Due: Tue Jul 05 (beginning of class)
Supplementary reading: R&N, Ch 14.1-14.3; KN, Ch 2.1-2.3.

1.1 Creative writing

Attach events to the binary random variables X, Y, and Z that are consistent with the following patterns of commonsense reasoning. You may use different events for the different parts of the problem.

(a) Explaining away:

    P(X=1 | Y=1) > P(X=1),
    P(X=1 | Y=1, Z=1) < P(X=1 | Y=1).

(b) Accumulating evidence:

    P(X=1) < P(X=1 | Y=1) < P(X=1 | Y=1, Z=1).

(c) Conditional independence:

    P(X, Y | Z) = P(X | Z) P(Y | Z),
    P(X=1, Y=1) ≠ P(X=1) P(Y=1).

1.2 Probabilistic inference

Recall the probabilistic model that we described in class for the binary random variables {E = Earthquake, B = Burglary, A = Alarm, J = JohnCalls, M = MaryCalls}. We also expressed this model as a belief network, with the directed acyclic graph (DAG) and conditional probability tables (CPTs) shown below:

    Earthquake --> Alarm <-- Burglary
    Alarm --> JohnCalls
    Alarm --> MaryCalls

    P(E=1) = 0.002
    P(B=1) = 0.001
    P(A=1 | E=0, B=0) = 0.001
    P(A=1 | E=0, B=1) = 0.94
    P(A=1 | E=1, B=0) = 0.29
    P(A=1 | E=1, B=1) = 0.95
    P(J=1 | A=0) = 0.05
    P(J=1 | A=1) = 0.90
    P(M=1 | A=0) = 0.01
    P(M=1 | A=1) = 0.70

Compute numeric values for the following probabilities, exploiting relations of marginal and conditional independence as much as possible to simplify your calculations. You may re-use numerical results from lecture, but otherwise show your work. Be careful not to drop significant digits in your answers.

(a) P(E=1 | A=1)          (b) P(E=1 | A=1, B=0)
(c) P(A=1 | M=1)          (d) P(A=1 | M=1, J=0)
(e) P(A=1 | M=0)          (f) P(A=1 | M=0, B=1)

Consider your results in (b) versus (a), (d) versus (c), and (f) versus (e). Do they seem consistent with commonsense patterns of reasoning?
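Though the assignment asks you to simplify and show these calculations by hand, a network this small (five binary variables, 32 joint assignments) can also be checked by brute-force enumeration. Here is a minimal sketch of such a check in Python; the CPT values are copied from the tables above, and the helper names are this sketch's own:

    from itertools import product

    # CPTs from the alarm network above; each entry gives P(Var=1 | parents).
    P_E = {1: 0.002, 0: 0.998}
    P_B = {1: 0.001, 0: 0.999}
    P_A1 = {(0, 0): 0.001, (0, 1): 0.94, (1, 0): 0.29, (1, 1): 0.95}
    P_J1 = {0: 0.05, 1: 0.90}
    P_M1 = {0: 0.01, 1: 0.70}

    def joint(e, b, a, j, m):
        """P(E=e, B=b, A=a, J=j, M=m), factored along the DAG."""
        pa = P_A1[(e, b)] if a == 1 else 1.0 - P_A1[(e, b)]
        pj = P_J1[a] if j == 1 else 1.0 - P_J1[a]
        pm = P_M1[a] if m == 1 else 1.0 - P_M1[a]
        return P_E[e] * P_B[b] * pa * pj * pm

    def prob(query, evidence):
        """P(query | evidence), with each argument a dict like {"E": 1}."""
        order = ("E", "B", "A", "J", "M")
        num = den = 0.0
        for vals in product((0, 1), repeat=5):
            world = dict(zip(order, vals))
            if any(world[k] != v for k, v in evidence.items()):
                continue
            p = joint(*vals)
            den += p
            if all(world[k] == v for k, v in query.items()):
                num += p
        return num / den

    print(prob({"E": 1}, {"A": 1}))          # part (a)
    print(prob({"E": 1}, {"A": 1, "B": 0}))  # part (b)

Enumeration like this is exponential in the number of variables, which is exactly why the assignment asks you to exploit marginal and conditional independence instead.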

1.3 Probabilistic reasoning

A patient is known to have contracted a rare disease which comes in two forms, represented by the values of a binary random variable X ∈ {0, 1}. Symptoms of the disease are represented by the binary random variables Y_k ∈ {0, 1}, and knowledge of the disease is summarized by the belief network:

    X --> Y_1, Y_2, Y_3, ..., Y_n    (X is the parent of every Y_k)

The conditional probability tables (CPTs) for this belief network are as follows. In the absence of evidence, both forms of the disease are equally likely, with prior probabilities:

    P(X=0) = P(X=1) = 1/2.

In the first form of the disease (X=0), all the symptoms are uniformly likely to be observed, with

    P(Y_k=0 | X=0) = 1/2    for all k.

By contrast, in the second form of the disease (X=1), the first symptom occurs with probability one,

    P(Y_1=1 | X=1) = 1,

while the k-th symptom (with k ≥ 2) occurs with probability

    P(Y_k=1 | X=1) = f(k-1) / f(k),

where the function f(k) is defined by

    f(k) = 2^k + (-1)^k.

Suppose that on the k-th day of the month, a test is done to determine whether the patient is exhibiting the k-th symptom, and that each such test returns a positive result. Thus, on the k-th day, the doctor observes the patient with symptoms {Y_1=1, Y_2=1, ..., Y_k=1}. Based on the cumulative evidence, the doctor makes a new diagnosis each day by computing the ratio:

    r_k = P(X=1 | Y_1=1, Y_2=1, ..., Y_k=1) / P(X=0 | Y_1=1, Y_2=1, ..., Y_k=1).

If this ratio is greater than 1, the doctor diagnoses the patient with the X=1 form of the disease; otherwise, with the X=0 form.

(a) Compute the ratio r_k as a function of k. How does the doctor's diagnosis depend on the day of the month? Show your work.

(b) Does the diagnosis become more or less certain as more symptoms are observed? Explain.
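Part (a) asks for a closed form, but a few lines of Python can check your answer numerically. This sketch computes r_k directly from the CPTs above; note that the equal priors on X cancel in the odds ratio:

    def f(k):
        return 2**k + (-1)**k

    def r(k):
        """Posterior odds r_k = P(X=1 | Y_1..Y_k=1) / P(X=0 | Y_1..Y_k=1)."""
        like_x0 = 0.5**k     # under X=0, each symptom occurs with probability 1/2
        like_x1 = 1.0        # under X=1, P(Y_1=1 | X=1) = 1
        for j in range(2, k + 1):
            like_x1 *= f(j - 1) / f(j)   # P(Y_j=1 | X=1) for j >= 2
        return like_x1 / like_x0

    for k in range(1, 8):
        print(k, r(k))

Watching how r_k moves above and below 1 as k grows should help you answer both (a) and (b).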

1.4 Hangman

Consider the belief network shown below, where the random variable W stores a five-letter word and the random variable L_i ∈ {A, B, ..., Z} reveals only the word's i-th letter. Also, suppose that these five-letter words are chosen at random from a large corpus of text according to their frequency:

    P(W=w) = COUNT(w) / Σ_{w'} COUNT(w'),

where COUNT(w) denotes the number of times that w appears in the corpus and where the denominator is a sum over all five-letter words. Note that in this model the conditional probability tables for the random variables L_i are particularly simple:

    P(L_i=l | W=w) = 1 if l is the i-th letter of w, and 0 otherwise.

Now imagine a game in which you are asked to guess the word w one letter at a time. The rules of this game are as follows: after each letter (A through Z) that you guess, you'll be told whether the letter appears in the word and also where it appears. Given the evidence that you have at any stage in this game, the critical question is what letter to guess next.

    W --> L_1, L_2, L_3, L_4, L_5    (W is the parent of each L_i)

Let's work an example. Suppose that after three guesses (the letters D, I, M) you've learned that the letter I does not appear, and that the letters D and M appear as follows:

    M – D – M

Now consider your next guess: call it l. In this game the best guess is the letter l that maximizes

    P(L_2=l or L_4=l | L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M}).

In other words, pick the letter l that is most likely to appear in the blank (unguessed) spaces of the word. For any letter l we can compute this probability as follows:

    P(L_2=l or L_4=l | L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M})

      = Σ_w P(W=w, L_2=l or L_4=l | L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M})    (marginalization)

      = Σ_w P(W=w | L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M}) P(L_2=l or L_4=l | W=w),    (product rule & CI)

where in the third line we have exploited the conditional independence (CI) of the letters L_i given the word W. Inside this sum there are two terms, and they are both easy to compute. In particular, the second term is more or less trivial:

    P(L_2=l or L_4=l | W=w) = 1 if l is the second or fourth letter of w, and 0 otherwise.

And the first term we obtain from Bayes rule:

    P(W=w | L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M})

          P(L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M} | W=w) P(W=w)
      = -----------------------------------------------------------------------    (Bayes rule)
            P(L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M})

In the numerator of Bayes rule are two terms; the left term is equal to zero or one depending on whether the evidence is compatible with the word w, and the right term is the prior probability P(W=w), as determined by the empirical word frequencies. The denominator of Bayes rule is given by:

    P(L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M})

      = Σ_w P(W=w, L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M})    (marginalization)

      = Σ_w P(W=w) P(L_1=M, L_3=D, L_5=M, L_2 ∉ {D, I, M}, L_4 ∉ {D, I, M} | W=w),    (product rule)

where again all the right terms inside the sum are equal to zero or one. Note that the denominator merely sums the empirical frequencies of words that are compatible with the observed evidence.

Now let's consider the general problem. Let E denote the evidence at some intermediate round of the game: in general, some letters will have been guessed correctly and their places revealed in the word, while other letters will have been guessed incorrectly and thus revealed to be absent. There are two essential computations. The first is the posterior probability, obtained from Bayes rule:

    P(W=w | E) = P(E | W=w) P(W=w) / Σ_{w'} P(E | W=w') P(W=w').

The second key computation is the predictive probability, based on the evidence, that the letter l appears somewhere in the word:

    P(L_i=l for some i ∈ {1,2,3,4,5} | E) = Σ_w P(L_i=l for some i ∈ {1,2,3,4,5} | W=w) P(W=w | E).

Note in particular how the first computation feeds into the second. Your assignment in this problem is to implement both of these calculations. You may program in the language of your choice. (One possible sketch of both computations, in Python, appears at the end of this problem.)

(a) Download the file hw2_word_counts_05.txt that appears with the homework assignment. The file contains a list of 5-letter words (including names and proper nouns) and their counts from a large corpus of Wall Street Journal articles (roughly three million sentences). From the counts in this file compute the prior probability P(W=w) = COUNT(w) / COUNT(total). As a sanity check, print out the fifteen most frequent 5-letter words, as well as the fifteen least frequent 5-letter words. Do your results make sense?

(b) Consider the following stages of the game. For each of the following, indicate the best next guess: namely, the letter l that is most likely to appear among the missing letters. Also report the probability P(L_i=l for some i ∈ {1,2,3,4,5} | E) for your guess l. Your answers should fill in the last two columns of this table. Some answers are shown so that you can check your work.

    correctly guessed    incorrectly guessed    best next guess l    P(L_i=l for some i | E)
    – – – – –            {}
    – – – – –            {A, O}
    B – – E –            {}
    B – – E –            {R}
    – – – H –            {E, I, M, N, T}
    – – – – –            {E, O}                 I                    0.6366
    D – – I –            {}                     A                    0.8207
    D – – I –            {A}                    E                    0.7521
    – U – – –            {A, E, I, O, S}        Y                    0.6270

(c) Turn in a hard-copy printout of your source code. Do not forget the source code: it is worth many points on this assignment.

More fun: The demo on Piazza (also under Resources) implements this program for words of length 6-10. You will also find count files for words of these lengths on Piazza, and if you modify your code to handle these different word lengths, you will also be able to check your answers against the demo. This is totally optional, though.

Just to be perfectly clear, you are not required in this problem to implement a user interface or any general functionality for the game of hangman. You will only be graded on your word lists in (a), the completed table for (b), and your source code in (c).
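For concreteness, here is one possible sketch of the two computations described above, written in Python (the language choice is yours). The filename is the one from part (a); the evidence encoding (a five-character pattern with '-' for unrevealed positions, plus the set of letters guessed so far) is this sketch's own convention, not something the assignment prescribes.

    from collections import Counter

    # (a) Read the corpus counts and form the prior P(W=w). Each line of the
    # file is assumed to hold a word and its count, separated by whitespace.
    counts = {}
    with open("hw2_word_counts_05.txt") as fh:
        for line in fh:
            word, count = line.split()
            counts[word] = int(count)
    total = sum(counts.values())
    prior = {w: c / total for w, c in counts.items()}

    def compatible(word, pattern, guessed):
        """P(E | W=word) as a 0/1 indicator. `pattern` has revealed letters
        in place and '-' for blanks; `guessed` holds every letter guessed so
        far, whether it was right or wrong."""
        for w_ch, p_ch in zip(word, pattern):
            if p_ch != "-":
                if w_ch != p_ch:       # revealed positions must match exactly
                    return False
            elif w_ch in guessed:      # blanks cannot hold any guessed letter
                return False
        return True

    def posterior(pattern, guessed):
        """Bayes rule: P(W=w | E), renormalizing the prior over the words
        that are compatible with the evidence."""
        post = {w: p for w, p in prior.items() if compatible(w, pattern, guessed)}
        z = sum(post.values())
        return {w: p / z for w, p in post.items()}

    def best_next_guess(pattern, guessed):
        """Predictive distribution: the letter l maximizing
        P(L_i=l for some blank position i | E), with its probability."""
        scores = Counter()
        for w, p in posterior(pattern, guessed).items():
            # Each compatible word contributes its posterior mass once to
            # every distinct letter appearing in its blank positions.
            for ch in {w[i] for i in range(5) if pattern[i] == "-"}:
                scores[ch] += p
        return scores.most_common(1)[0]

    # The worked example above: evidence M-D-M after guessing D, I, and M.
    print(best_next_guess("M-D-M", {"D", "I", "M"}))

    # One row of the table in (b): pattern D--I- with A guessed incorrectly.
    # Expected output, per the table: ('E', 0.7521...).
    print(best_next_guess("D--I-", {"D", "I", "A"}))

Note how posterior() feeds into best_next_guess(), mirroring the two equations above; because P(E | W=w) is always zero or one in this model, the posterior is just the renormalized prior over the compatible words.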