
CSE 150. Assignment 2
Summer 2016
Out: Thu Jun 30
Due: Tue Jul 05, beginning of class
Supplementary reading: RN, Ch 14.1-14.3; KN, Ch 2.1-2.3.

1.1 Creative writing

Attach events to the binary random variables X, Y, and Z that are consistent with the following patterns of commonsense reasoning. You may use different events for the different parts of the problem.

(a) Explaining away:
$$P(X=1 \mid Y=1) > P(X=1), \qquad P(X=1 \mid Y=1, Z=1) < P(X=1 \mid Y=1)$$

(b) Accumulating evidence:
$$P(X=1) < P(X=1 \mid Y=1) < P(X=1 \mid Y=1, Z=1)$$

(c) Conditional independence:
$$P(X, Y \mid Z) = P(X \mid Z)\,P(Y \mid Z), \qquad P(X=1, Y=1) \neq P(X=1)\,P(Y=1)$$

1.2 Probabilistic inference

Recall the probabilistic model that we described in class for the binary random variables {E = Earthquake, B = Burglary, A = Alarm, J = JohnCalls, M = MaryCalls}. We also expressed this model as a belief network, with the directed acyclic graph (DAG) and conditional probability tables (CPTs) shown below:

[Figure: belief network with edges Earthquake -> Alarm, Burglary -> Alarm, Alarm -> JohnCalls, Alarm -> MaryCalls.]

$P(E=1) = 0.002$
$P(B=1) = 0.001$

$P(A=1 \mid E=0, B=0) = 0.001$
$P(A=1 \mid E=0, B=1) = 0.94$
$P(A=1 \mid E=1, B=0) = 0.29$
$P(A=1 \mid E=1, B=1) = 0.95$

$P(J=1 \mid A=0) = 0.05$
$P(J=1 \mid A=1) = 0.90$

$P(M=1 \mid A=0) = 0.01$
$P(M=1 \mid A=1) = 0.70$

Compute numeric values for the following probabilities, exploiting relations of marginal and conditional independence as much as possible to simplify your calculations. You may re-use numerical results from lecture, but otherwise show your work. Be careful not to drop significant digits in your answer.

(a) $P(E=1 \mid A=1)$
(b) $P(E=1 \mid A=1, B=0)$
(c) $P(A=1 \mid M=1)$
(d) $P(A=1 \mid M=1, J=0)$
(e) $P(A=1 \mid M=0)$
(f) $P(A=1 \mid M=0, B=1)$

Consider your results in (b) versus (a), (d) versus (c), and (f) versus (e). Do they seem consistent with commonsense patterns of reasoning?
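Hand calculations for problem 1.2 can be cross-checked numerically. The following is a minimal sketch, not part of the assignment, that evaluates queries by brute-force enumeration of the joint distribution defined by the CPTs above. The helper names (`bern`, `joint`, `prob`, `conditional`) are illustrative assumptions. Enumeration does not exploit any independence relations, so it is a check on a final number rather than a substitute for the requested derivation.

```python
from itertools import product

# CPT entries copied from the problem statement (probability that each variable equals 1).
P_E1 = 0.002
P_B1 = 0.001
P_A1 = {(0, 0): 0.001, (0, 1): 0.94, (1, 0): 0.29, (1, 1): 0.95}  # keyed by (E, B)
P_J1 = {0: 0.05, 1: 0.90}  # keyed by A
P_M1 = {0: 0.01, 1: 0.70}  # keyed by A

VARS = ("E", "B", "A", "J", "M")


def bern(p1, value):
    """Probability that a binary variable takes `value`, given P(variable = 1) = p1."""
    return p1 if value == 1 else 1.0 - p1


def joint(e, b, a, j, m):
    """Joint probability from the belief network factorization."""
    return (bern(P_E1, e) * bern(P_B1, b) * bern(P_A1[(e, b)], a)
            * bern(P_J1[a], j) * bern(P_M1[a], m))


def prob(event):
    """Marginal probability of a partial assignment such as {"A": 1}."""
    total = 0.0
    for values in product((0, 1), repeat=len(VARS)):
        assignment = dict(zip(VARS, values))
        if all(assignment[name] == v for name, v in event.items()):
            total += joint(*values)
    return total


def conditional(query, evidence):
    """P(query | evidence), computed as P(query, evidence) / P(evidence)."""
    return prob({**query, **evidence}) / prob(evidence)
```

For example, `conditional({"E": 1}, {"A": 1})` evaluates the quantity asked for in part (a), which can then be compared against a value derived by hand.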

1.3 Probabilistic reasoning

A patient is known to have contracted a rare disease which comes in two forms, represented by the values of a binary random variable $X \in \{0, 1\}$. Symptoms of the disease are represented by the binary random variables $Y_k \in \{0, 1\}$, and knowledge of the disease is summarized by the belief network:

[Figure: belief network in which X is the parent of each of Y_1, Y_2, Y_3, ..., Y_n.]

The conditional probability tables (CPTs) for this belief network are as follows. In the absence of evidence, both forms of the disease are equally likely, with prior probabilities $P(X=0) = P(X=1) = \frac{1}{2}$. In the first form of the disease ($X=0$), all the symptoms are uniformly likely to be observed, with $P(Y_k=0 \mid X=0) = \frac{1}{2}$ for all $k$. By contrast, in the second form of the disease ($X=1$), the first symptom occurs with probability one,
$$P(Y_1=1 \mid X=1) = 1,$$
while the $k$th symptom (with $k \geq 2$) occurs with probability
$$P(Y_k=1 \mid X=1) = \frac{f(k-1)}{f(k)},$$
where the function $f(k)$ is defined by
$$f(k) = 2^k + (-1)^k.$$

Suppose that on the $k$th day of the month, a test is done to determine whether the patient is exhibiting the $k$th symptom, and that each such test returns a positive result. Thus, on the $k$th day, the doctor observes the patient with symptoms $\{Y_1=1, Y_2=1, \ldots, Y_k=1\}$. Based on the cumulative evidence, the doctor makes a new diagnosis each day by computing the ratio:
$$r_k = \frac{P(X=1 \mid Y_1=1, Y_2=1, \ldots, Y_k=1)}{P(X=0 \mid Y_1=1, Y_2=1, \ldots, Y_k=1)}.$$
If this ratio is greater than 1, the doctor diagnoses the patient with the $X=1$ form of the disease; otherwise, with the $X=0$ form.

(a) Compute the ratio $r_k$ as a function of $k$. How does the doctor's diagnosis depend on the day of the month? Show your work.

(b) Does the diagnosis become more or less certain as more symptoms are observed? Explain.
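A closed-form answer to part (a) can be checked against a direct numerical evaluation of the ratio. The sketch below is an illustration only. It assumes the reading $f(k) = 2^k + (-1)^k$ and $P(Y_k=1 \mid X=1) = f(k-1)/f(k)$ for $k \geq 2$, and it uses the fact that the symptoms are conditionally independent given $X$ in this network. The function names are hypothetical. Exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction


def f(k):
    """Assumed definition: f(k) = 2^k + (-1)^k."""
    return 2 ** k + (-1) ** k


def p_symptom_given_x1(k):
    """P(Y_k = 1 | X = 1): one for the first symptom, f(k-1)/f(k) for k >= 2."""
    if k == 1:
        return Fraction(1)
    return Fraction(f(k - 1), f(k))


def ratio(k):
    """r_k from Bayes rule, multiplying per-symptom likelihoods for days 1..k."""
    prior = Fraction(1, 2)           # P(X=0) = P(X=1) = 1/2
    likelihood_x1 = Fraction(1)
    likelihood_x0 = Fraction(1)
    for day in range(1, k + 1):
        likelihood_x1 *= p_symptom_given_x1(day)
        likelihood_x0 *= Fraction(1, 2)  # P(Y_k = 1 | X = 0) = 1/2
    # The normalizing constant P(Y_1=1, ..., Y_k=1) cancels in the ratio.
    return (prior * likelihood_x1) / (prior * likelihood_x0)
```

Evaluating `ratio(k)` for the first several days and comparing against a hand-derived expression is a quick way to catch algebra slips.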

1.4 Hangman

Consider the belief network shown below, where the random variable $W$ stores a five-letter word and the random variable $L_i \in \{A, B, \ldots, Z\}$ reveals only the word's $i$th letter. Also, suppose that these five-letter words are chosen at random from a large corpus of text according to their frequency:
$$P(W=w) = \frac{\mathrm{COUNT}(w)}{\sum_{w'} \mathrm{COUNT}(w')},$$
where $\mathrm{COUNT}(w)$ denotes the number of times that $w$ appears in the corpus and where the denominator is a sum over all five-letter words. Note that in this model the conditional probability tables for the random variables $L_i$ are particularly simple:
$$P(L_i=\ell \mid W=w) = \begin{cases} 1 & \text{if } \ell \text{ is the } i\text{th letter of } w, \\ 0 & \text{otherwise.} \end{cases}$$

Now imagine a game in which you are asked to guess the word $w$ one letter at a time. The rules of this game are as follows: after each letter (A through Z) that you guess, you'll be told whether the letter appears in the word and also where it appears. Given the evidence that you have at any stage in this game, the critical question is what letter to guess next.

[Figure: belief network in which W is the parent of each of L_1, L_2, L_3, L_4, L_5.]

Let's work an example. Suppose that after three guesses (the letters D, I, M) you've learned that the letter I does not appear, and that the letters D and M appear as follows:

M _ D _ M

Now consider your next guess: call it $\ell$. In this game the best guess is the letter $\ell$ that maximizes
$$P\big(L_2=\ell \text{ or } L_4=\ell \,\big|\, L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\}\big).$$
In other words, pick the letter $\ell$ that is most likely to appear in the blank (unguessed) spaces of the word. For any letter $\ell$ we can compute this probability as follows:
$$P\big(L_2=\ell \text{ or } L_4=\ell \,\big|\, L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\}\big)$$
$$= \sum_w P\big(W=w, L_2=\ell \text{ or } L_4=\ell \,\big|\, L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\}\big) \quad \text{(marginalization)}$$
$$= \sum_w P\big(W=w \,\big|\, L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\}\big)\, P\big(L_2=\ell \text{ or } L_4=\ell \,\big|\, W=w\big) \quad \text{(product rule \& CI)}$$

where in the third line we have exploited the conditional independence (CI) of the letters $L_i$ given the word $W$. Inside this sum there are two terms, and they are both easy to compute. In particular, the second term is more or less trivial:
$$P(L_2=\ell \text{ or } L_4=\ell \mid W=w) = \begin{cases} 1 & \text{if } \ell \text{ is the second or fourth letter of } w, \\ 0 & \text{otherwise.} \end{cases}$$
And the first term we obtain from Bayes rule:
$$P\big(W=w \,\big|\, L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\}\big) = \frac{P\big(L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\} \,\big|\, W=w\big)\, P(W=w)}{P\big(L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\}\big)} \quad \text{(Bayes rule)}$$
In the numerator of Bayes rule are two terms; the left term is equal to zero or one (depending on whether the evidence is compatible with the word $w$), and the right term is the prior probability $P(W=w)$, as determined by the empirical word frequencies. The denominator of Bayes rule is given by:
$$P\big(L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\}\big)$$
$$= \sum_w P\big(W=w, L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\}\big) \quad \text{(marginalization)}$$
$$= \sum_w P(W=w)\, P\big(L_1=M, L_3=D, L_5=M, L_2 \notin \{D, I, M\}, L_4 \notin \{D, I, M\} \,\big|\, W=w\big) \quad \text{(product rule)}$$
where again all the right terms inside the sum are equal to zero or one. Note that the denominator merely sums the empirical frequencies of words that are compatible with the observed evidence.

Now let's consider the general problem. Let $E$ denote the evidence at some intermediate round of the game: in general, some letters will have been guessed correctly and their places revealed in the word, while other letters will have been guessed incorrectly and thus revealed to be absent. There are two essential computations. The first is the posterior probability, obtained from Bayes rule:
$$P(W=w \mid E) = \frac{P(E \mid W=w)\, P(W=w)}{\sum_{w'} P(E \mid W=w')\, P(W=w')}.$$
The second key computation is the predictive probability, based on the evidence, that the letter $\ell$ appears somewhere in the word:
$$P\big(L_i=\ell \text{ for some } i \in \{1, 2, 3, 4, 5\} \,\big|\, E\big) = \sum_w P\big(L_i=\ell \text{ for some } i \in \{1, 2, 3, 4, 5\} \,\big|\, W=w\big)\, P(W=w \mid E).$$
Note in particular how the first computation feeds into the second.
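The two computations above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not a reference solution: the evidence $E$ is encoded as a pattern string with `-` for blanks (for example `"M-D-M"`) plus a set of incorrectly guessed letters, the prior is a dictionary from uppercase words to probabilities, and all function names are hypothetical. The sketch assumes at least one word in the prior is compatible with the evidence; otherwise the normalization divides by zero.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def is_compatible(word, pattern, wrong):
    """P(E | W = word) as a boolean: does the word agree with all the evidence?"""
    if len(word) != len(pattern):
        return False
    guessed = set(pattern.replace("-", "")) | set(wrong)
    for word_ch, pattern_ch in zip(word, pattern):
        if pattern_ch == "-":
            # A letter already guessed (right or wrong) cannot occupy a blank.
            if word_ch in guessed:
                return False
        elif word_ch != pattern_ch:
            return False
    return True


def posterior(prior, pattern, wrong):
    """P(W = w | E): keep compatible words and renormalize their priors."""
    weights = {w: p for w, p in prior.items() if is_compatible(w, pattern, wrong)}
    normalizer = sum(weights.values())  # the denominator of Bayes rule
    return {w: p / normalizer for w, p in weights.items()}


def predictive(prior, pattern, wrong):
    """P(L_i = letter for some i | E) for every letter not yet guessed."""
    post = posterior(prior, pattern, wrong)
    guessed = set(pattern.replace("-", "")) | set(wrong)
    return {
        letter: sum(p for w, p in post.items() if letter in w)
        for letter in ALPHABET
        if letter not in guessed
    }


# Made-up toy prior for illustration only; real priors come from the count file.
toy_prior = {"MADAM": 0.5, "MODEM": 0.3, "MINIM": 0.2}
scores = predictive(toy_prior, "M-D-M", {"I"})
best_guess = max(scores, key=scores.get)
```

Because compatible words never contain an already-guessed letter in a blank, the test `letter in w` for an unguessed letter is equivalent to asking whether the letter occupies one of the blank positions.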
Your assignment in this problem is to implement both of these calculations. You may program in the language of your choice.

(a) Download the file hw2_word_counts_05.txt that appears with the homework assignment. The file contains a list of 5-letter words (including names and proper nouns) and their counts from a large corpus of Wall Street Journal articles (roughly three million sentences). From the counts in this file compute the prior probability $P(w) = \mathrm{COUNT}(w)/\mathrm{COUNT}_{\text{total}}$. As a sanity check, print out the fifteen most frequent 5-letter words, as well as the fifteen least frequent 5-letter words. Do your results make sense?
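One possible way to organize part (a) is sketched below. The file layout is an assumption (one word and one integer count per line, separated by whitespace); adjust the parsing if the actual file differs. The function names `load_priors` and `extremes` are hypothetical.

```python
def load_priors(path):
    """Read 'WORD COUNT' lines and return a dict mapping word -> prior probability."""
    counts = {}
    with open(path) as handle:
        for line in handle:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            word, count = parts[0].upper(), int(parts[1])
            counts[word] = counts.get(word, 0) + count
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def extremes(prior, n=15):
    """Return the n most probable and the n least probable (word, probability) pairs."""
    ranked = sorted(prior.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n], ranked[-n:]
```

A typical use would be `most, least = extremes(load_priors("hw2_word_counts_05.txt"))` followed by printing both lists for the sanity check.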

(b) Consider the following stages of the game. For each of the following, indicate the best next guess, namely the letter $\ell$ that is most likely (probable) to be among the missing letters. Also report the probability $P(L_i=\ell \text{ for some } i \in \{1, 2, 3, 4, 5\} \mid E)$ for your guess $\ell$. Your answers should fill in the last two columns of this table. (Some answers are shown so that you can check your work.)

correctly guessed    incorrectly guessed    best next guess    probability
(none)               {}
(none)               {A, O}
B E                  {}
B E                  {R}
H                    {E, I, M, N, T}
(none)               {E, O}                 I                  0.6366
D I                  {}                     A                  0.8207
D I                  {A}                    E                  0.7521
U                    {A, E, I, O, S}        Y                  0.6270

(c) Turn in a hard-copy printout of your source code. Do not forget the source code: it is worth many points on this assignment.

More fun: The demo on Piazza (also under resources) implements this program for words of length 6-10. You will also find count files for words of these lengths on Piazza, and if you modify your code to handle these different word lengths, you will also be able to check your answers against the demo. This is totally optional, though.

Just to be perfectly clear, you are not required in this problem to implement a user interface or any general functionality for the game of hangman. You will only be graded on your word lists in (a), the completed table for (b), and your source code in (c).