The Stable Marriage Problem




William Hunt
Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV
William.Hunt@mail.wvu.edu

1 Introduction

Imagine you are a matchmaker, with one hundred female clients and one hundred male clients. Each of the women has given you a complete list of the hundred men, ordered by her preference: her first choice, second choice, and so on. Each of the men has given you a list of the women, ranked similarly. It is your job to arrange one hundred happy marriages.

In this problem, we have a set of $n$ men and $n$ women. Each person has their own preference list of the persons they want to marry. Our job is to determine an assignment where each man is married to one and only one woman (monogamous and heterosexual).

The men are denoted (A, B, C, ...), and each has a list of the women (a, b, c, ...) ordered by his preference, as in Figure 1. Each woman has a similarly ranked list of the men. Every man is on every woman's list and every woman is on every man's list.

The goal is to have a set of stable marriages, M, between the men and the women. Given married pairs X-a and Y-b, if man X prefers woman b to his current wife a, and woman b prefers X to her current husband Y, then X-b is called a dissatisfied pair.

Figure 1: Sample Preference List for Men and Women

The marriage M is said to be a stable marriage if there are no dissatisfied pairs. Figure 2 shows a stable marriage for the preference lists given in Figure 1.

Figure 2: Sample Stable Marriage for Men and Women

The simplest approach to solving this problem is the following:

Function Simple-Proposal-But-Invalid
    Start with some assignment between the men and women
    loop
        if assignment is stable then
            stop
        else
            find a dissatisfied pair and swap mates to satisfy the pair
        end if
    end loop

Algorithm 1.1: An Invalid Simple Algorithm for Proposal

This will NOT work, since the loop may never end: each swap can create new dissatisfied pairs, so the assignment can cycle forever. Fortunately, there is an equally simple, deterministic algorithm that does work.

1.1 Proposal Algorithm

Function Proposal-Algorithm
    while there is an unpaired man do
        pick an unpaired man X and the first woman w on his list
        remove w from his list so she won't be picked again
        if w is engaged then
            if w prefers X to her current partner Y then
                set X-w as married
                set Y-w as unmarried, so now Y is unpaired
            else
                X is still unpaired, since w is happier with Y
            end if
        else
            w was not previously paired, so accept X-w immediately as married
        end if
    end while

Algorithm 1.2: Proposal Algorithm

At each iteration, one man eliminates one woman from his list. Since each of the $n$ lists has $n$ elements, there are at most $n^2$ proposals.

Now, we have a few questions to ask regarding this algorithm.

1. Does the algorithm terminate?

Once a woman becomes attached, she remains married from then on, although she may trade her partner for a better mate who proposes to her. That makes this algorithm greedy from the women's point of view. A man eliminates one choice from his list each time he proposes, so the loop runs at most $n^2$ times. Moreover, no man can exhaust his list while still unpaired: if he had proposed to all $n$ women, every woman would be married, and since each is married to a distinct man, all $n$ men, including him, would be married. Therefore the algorithm terminates with all women and men married.

2. Is the resulting marriage a stable marriage?

To show that it is a stable marriage, assume we have a dissatisfied pair X-b, where in the marriage they are paired as X-a and Y-b. Since X prefers woman b to his current partner a, he must have proposed to b before a. Woman b either rejected him, or accepted him and later dropped him for a man she prefers to X. Since a woman's partners only improve over time, b must prefer Y to X, contradicting our assumption that X-b is a dissatisfied pair. So the marriage is stable.
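As an illustration, Algorithm 1.2 can be sketched in Python roughly as follows. This is a minimal sketch, not the author's implementation: the function names `proposal_algorithm` and `is_stable`, the dictionary-based input format, and the 3-by-3 preference lists are all assumptions made for the example (the lists are not those of Figure 1).

```python
from collections import deque


def proposal_algorithm(men_prefs, women_prefs):
    """Run the proposal algorithm; return (wife_of, number_of_proposals).

    men_prefs[m] is m's list of women, most preferred first;
    women_prefs[w] is w's list of men, most preferred first.
    """
    # rank[w][m] = position of man m on woman w's list (lower is better).
    rank = {w: {m: r for r, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # index of next woman to try
    husband_of = {}                          # current partner of each woman
    unpaired = deque(men_prefs)
    proposals = 0

    while unpaired:
        x = unpaired.popleft()
        w = men_prefs[x][next_choice[x]]
        next_choice[x] += 1                  # w is removed from x's list
        proposals += 1
        y = husband_of.get(w)
        if y is None:
            husband_of[w] = x                # w was free: accept immediately
        elif rank[w][x] < rank[w][y]:
            husband_of[w] = x                # w trades up; y becomes unpaired
            unpaired.append(y)
        else:
            unpaired.append(x)               # w stays with y; x is still unpaired

    wife_of = {m: w for w, m in husband_of.items()}
    return wife_of, proposals


def is_stable(wife_of, men_prefs, women_prefs):
    """Return True if the marriage has no dissatisfied pair."""
    husband_of = {w: m for m, w in wife_of.items()}
    for x, prefs in men_prefs.items():
        for w in prefs:
            if w == wife_of[x]:
                break                        # remaining women are less preferred
            y = husband_of[w]
            if women_prefs[w].index(x) < women_prefs[w].index(y):
                return False                 # x and w prefer each other
    return True


# Hypothetical instance, chosen only for illustration.
men = {"A": ["a", "b", "c"], "B": ["a", "c", "b"], "C": ["b", "a", "c"]}
women = {"a": ["B", "A", "C"], "b": ["A", "C", "B"], "c": ["A", "B", "C"]}
marriage, count = proposal_algorithm(men, women)
print(marriage, count, is_stable(marriage, men, women))
```

The queue of unpaired men is one possible way to "pick an unpaired man"; the algorithm's correctness does not depend on the order in which unpaired men are chosen.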

1.1.1 Probabilistic Analysis

The following is an average-case analysis of the Proposal Algorithm. Let

$T_P$ = number of proposals made.

Since $T_P$ has a lot of dependencies during each step, it is difficult to analyze directly. In this analysis, we assume that the men's lists are chosen independently and uniformly at random from all possible lists. The women's lists are arbitrary, but fixed in advance. Since there are $n!$ different lists, the probability that a man gets a particular sequence is $\frac{1}{n!}$. We are going to argue that the expected value of the number of proposals is roughly $O(n \ln n)$.

1.1.2 Principle of Deferred Decisions

To argue about the expected value, we are going to use the technique of the Principle of Deferred Decisions. The idea is that random choices are not all made in advance; instead, the algorithm makes random choices as it needs them. An illustration of this technique is the Clock Solitaire Game.

In this game, you have a shuffled deck of 52 cards. The cards are dealt into 13 piles of four cards each. Each pile is named with a distinct member of A, 2, 3, ..., 10, J, Q, K. On the first move, draw a card from the K pile. Each following draw comes from the pile named by the face value of the card from the previous draw. The game ends when you try to draw from an empty pile. If all cards are drawn, then you win.

Will this game end? Yes, and it will always end with a king in your hand. Each pile holds four cards, and the deck holds four cards of each face value. A pile other than the K pile is visited only after drawing a card of its face value, so it is visited at most four times and is never empty when you arrive. The K pile is visited once at the start and once more for each king drawn, so it is the fourth king that sends you to an empty pile. Therefore, the game is won exactly when all cards are drawn from the piles with the last card drawn being a king.

To determine the probability of winning directly, we would need to account for the fact that every draw depends on the draws before it, which is tough to calculate. Another way of determining the probability of winning is to think of the game as drawing cards, one after another without replacement, at random from the deck of cards. To win the game, we need the 52nd card drawn to be a king. Thus, the winning probability is

$$\frac{4 \text{ kings}}{52 \text{ cards in deck}} = \frac{1}{13}.$$
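The value 1/13 can be checked empirically by playing the game many times. The following is a small simulation sketch, assuming the rules exactly as described above; the names `play_clock_solitaire` and `estimate_win_probability`, the seed, and the number of trials are choices made for this example only.

```python
import random

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]


def play_clock_solitaire(rng):
    """Play one game; return (cards_drawn, pile_found_empty)."""
    deck = [rank for rank in RANKS for _ in range(4)]
    rng.shuffle(deck)
    # Deal four cards to each of the 13 named piles.
    piles = {rank: deck[4 * i:4 * i + 4] for i, rank in enumerate(RANKS)}

    current = "K"          # the first draw comes from the K pile
    cards_drawn = 0
    while piles[current]:
        card = piles[current].pop()
        cards_drawn += 1
        current = card     # next draw comes from the pile named by this card
    return cards_drawn, current


def estimate_win_probability(trials, seed=0):
    """Fraction of simulated games in which all 52 cards are drawn."""
    rng = random.Random(seed)
    wins = sum(1 for _ in range(trials)
               if play_clock_solitaire(rng)[0] == 52)
    return wins / trials


print(estimate_win_probability(20000), 1 / 13)
```

The second value returned by `play_clock_solitaire` is the pile that was found empty; in line with the argument above it should always be the K pile.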

1.2 Amnesiac Algorithm

In the analysis of the Proposal Algorithm, we can simplify by assuming that each man generates his list one woman at a time, choosing at random among the women that haven't rejected him. A problem that arises is that a woman's choice depends on the man that proposes to her. To remove this dependency, we can modify the behavior of our algorithm. We can have the man generate his list by selecting a woman uniformly at random from all $n$ women, including those that have already rejected him. He has forgotten the fact that these women have already rejected him, thus the Amnesiac Algorithm. A repeated proposal to a woman who has already rejected him is simply rejected again, so the outcome is unchanged and only the number of proposals grows.

This is easy to analyze because we only care about the total number of proposals, and each proposal is made independently to one of the $n$ women chosen uniformly at random. Let $T_A$ be the number of proposals made by the Amnesiac Algorithm. Then

$$\text{for all } m, \quad \Pr[T_A > m] \ge \Pr[T_P > m].$$

From the above we can see that $T_A$ stochastically dominates $T_P$. Therefore, we do not need to find an upper bound on $T_P$ (which is hard to do). Instead, we can use the upper bound on $T_A$ (which is easy to do).

Theorem 1.1:

$$\lim_{n \to \infty} \Pr[T_A > m] = 1 - e^{-e^{-c}}, \quad \text{for } m = n \ln n + cn,\ c \in \mathbb{R}.$$

This result can be derived using the Coupon Collector's Problem.

1.2.1 Coupon Collector's Problem

To analyze how long the algorithm takes, we need to find out how many random proposals need to occur before every woman has received at least one proposal, at which point no one is left unpaired. This is the same as an occupancy problem where $m$ balls are randomly put into $n$ bins. That occupancy problem can be translated into the supermarket realm. In this problem, there are $n$ types of coupons and $m$ visits to the store. At each visit, you get a coupon whose type is chosen uniformly at random. The question is: how many visits, $m$, do I have to make to be sure that I have one coupon of each type?

Analysis

Let $X$ be a random variable defined to be the number of trials required to collect at least one of each type of coupon. Let $C_1, C_2, \ldots, C_X$ denote the sequence of trials, where $C_i \in \{1, 2, \ldots, n\}$ is the type of the coupon drawn in the $i$-th trial. $C_i$ is considered a success if the coupon type $C_i$ was NOT drawn in the first $i - 1$ selections. By this definition, $C_1$ and $C_X$ will always be successes.

Divide the sequence into epochs, where epoch $i$ begins with the trial following the $i$-th success and ends with the trial on which we obtain the $(i+1)$-st success. So, we can define a random variable $X_i$, with $0 \le i \le n - 1$, as the number of trials in the $i$-th epoch. Since the epochs partition the sequence of trials, we can express $X$ in terms of the $X_i$:

$$X = \sum_{i=0}^{n-1} X_i.$$

Now we need to answer the following question: what is the distribution of each $X_i$? Let $p_i$ be the probability of success of any trial of the $i$-th epoch. From the occupancy problem viewpoint, $p_i$ is the probability of hitting a bin that has not been hit before. Since $i$ distinct coupon types have already been collected, the probability of success is

$$p_i = \frac{n - i}{n}.$$

$X_i$ is geometrically distributed with parameter $p_i$, therefore the following hold by definition of the distribution:

$$E[X_i] = \frac{1}{p_i}, \qquad \sigma^2_{X_i} = \frac{1 - p_i}{p_i^2}.$$

By linearity of expectation, and because the variance of a sum of independent random variables is the sum of their variances, we can calculate $E[X]$ and $\sigma^2_X$ as follows:

$$E[X] = \sum_{i=0}^{n-1} E[X_i] = \sum_{i=0}^{n-1} \frac{1}{p_i} = \sum_{i=0}^{n-1} \frac{n}{n - i} = n \sum_{i=1}^{n} \frac{1}{i} = n H_n.$$

Since $H_n = \ln n + \Theta(1)$, we have $E[X] = n \ln n + O(n)$.

$$\sigma^2_X = \sum_{i=0}^{n-1} \sigma^2_{X_i} = \sum_{i=0}^{n-1} \frac{1 - p_i}{p_i^2} = \sum_{i=0}^{n-1} \frac{n i}{(n - i)^2} = \sum_{i=1}^{n} \frac{n (n - i)}{i^2} = n^2 \sum_{i=1}^{n} \frac{1}{i^2} - n H_n.$$

Since $\sum_{i=1}^{n} \frac{1}{i^2}$ converges to $\frac{\pi^2}{6}$ as $n \to \infty$, we have the following limit:

$$\lim_{n \to \infty} \frac{\sigma^2_X}{n^2} = \frac{\pi^2}{6}.$$

Our next goal is to show that $X$ will not deviate far from its expected value.

Let $E_i^r$ denote the event that coupon $i$ is NOT collected in the first $r$ trials. These trials are done independently and with replacement, so

$$\Pr[E_i^r] = \left(1 - \frac{1}{n}\right)^r \le e^{-r/n}.$$

If we let $r = \beta n \ln n$, then $\Pr[E_i^r] \le n^{-\beta}$. Since the probability of a union of events is at most the sum of the probabilities of those events, we can bound $\Pr[X > r]$, for $r = \beta n \ln n$, as:

$$\Pr[X > r] = \Pr\left[\bigcup_{i=1}^{n} E_i^r\right] \le \sum_{i=1}^{n} \Pr[E_i^r] \le \sum_{i=1}^{n} n^{-\beta} = n^{-(\beta - 1)}.$$
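The sums for the mean and variance can be evaluated exactly for small $n$ as a sanity check. The sketch below uses exact rational arithmetic; the function name `coupon_collector_moments` is an assumption made for this example.

```python
import math
from fractions import Fraction


def coupon_collector_moments(n):
    """Exact E[X] and Var[X] from the per-epoch geometric distributions.

    Epoch i (0 <= i <= n-1) succeeds with probability p_i = (n - i) / n,
    so E[X_i] = n / (n - i) and Var[X_i] = n * i / (n - i)**2.
    """
    mean = sum(Fraction(n, n - i) for i in range(n))
    variance = sum(Fraction(n * i, (n - i) ** 2) for i in range(n))
    return mean, variance


n = 1000
mean, variance = coupon_collector_moments(n)
harmonic = sum(Fraction(1, i) for i in range(1, n + 1))

# Closed forms derived in the text.
assert mean == n * harmonic
assert variance == n * n * sum(Fraction(1, i * i) for i in range(1, n + 1)) - n * harmonic

print(float(mean), n * math.log(n))                 # E[X] against n ln n
print(float(variance) / n ** 2, math.pi ** 2 / 6)   # Var[X] / n^2 against pi^2 / 6
```

For $n = 2$ this gives $E[X] = 3$ and $\sigma^2_X = 2$, which matches a direct calculation: the first trial always succeeds, and the second epoch is geometric with $p = 1/2$.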

1.2.2 Poisson Heuristic

To help us show that $X$ will not deviate far from its expected value, we can use the Poisson distribution as an approximation of the binomial distribution. Let $N_i^r$ be the number of times coupon $i$ is chosen during the first $r$ trials. $N_i^r$ follows the binomial distribution with parameters $r$ and $p = \frac{1}{n}$:

$$\Pr[N_i^r = x] = \binom{r}{x} p^x (1 - p)^{r - x}, \quad \text{with } 0 \le x \le r,$$

$$\Pr[E_i^r] = \Pr[N_i^r = 0].$$

Let $\lambda \in \mathbb{R}^+$. A random variable $Y$ follows the Poisson distribution with parameter $\lambda$ if for any non-negative integer $y$,

$$\Pr[Y = y] = \frac{\lambda^y e^{-\lambda}}{y!}.$$

Assuming that $p$ is small and $r \to \infty$, the Poisson distribution with $\lambda = rp$ is a good approximation for the binomial distribution. Using this approximation, with $\lambda = \frac{r}{n}$, the probability of the event $E_i^r$ is:

$$\Pr[E_i^r] = \Pr[N_i^r = 0] \approx \frac{\lambda^0 e^{-\lambda}}{0!} = e^{-r/n}.$$

A benefit of using the Poisson distribution is that we can now say that the events $E_i^r$, for $1 \le i \le n$, are almost independent.

Claim 1.1: For $1 \le i \le n$, and for any set of indices $\{j_1, \ldots, j_k\}$ not containing $i$, we want to show:

$$\Pr\left[E_i^r \,\middle|\, \bigcap_{l=1}^{k} E_{j_l}^r\right] \approx \Pr[E_i^r] \approx e^{-r/n}.$$

Proof: Working with the left hand side:

$$\Pr\left[E_i^r \,\middle|\, \bigcap_{l=1}^{k} E_{j_l}^r\right] = \frac{\Pr\left[E_i^r \cap \left(\bigcap_{l=1}^{k} E_{j_l}^r\right)\right]}{\Pr\left[\bigcap_{l=1}^{k} E_{j_l}^r\right]}$$

by the definition of conditional probability. The numerator is the probability that $k + 1$ given coupons are not selected in $r$ trials, and the denominator is the probability that $k$ given coupons are not selected in $r$ trials, giving us:

$$= \frac{\left(1 - \frac{k+1}{n}\right)^r}{\left(1 - \frac{k}{n}\right)^r}.$$

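Using the identity

$$\lim_{\alpha \to 0} (1 + \alpha)^{1/\alpha} = e,$$

we can rewrite the left hand side as:

$$\approx \frac{e^{-r(k+1)/n}}{e^{-rk/n}} = e^{-r/n},$$

thus the desired result, that the events may be treated as (approximately) independent, is shown.

A new question: what is the probability that all coupons are collected in the first $m$ trials?

$$\Pr\left[\neg\left(\bigcup_{i=1}^{n} E_i^m\right)\right] = \Pr\left[\bigcap_{i=1}^{n} \left(\neg E_i^m\right)\right] \quad \text{by De Morgan's Law}$$

$$\approx \left(1 - e^{-m/n}\right)^n \approx e^{-n e^{-m/n}} \quad \text{since the events are (approximately) independent.}$$

Let $m = n(\ln n + c)$, for any $c \in \mathbb{R}$, so that $n e^{-m/n} = e^{-c}$. Then by the preceding argument,

$$\Pr[X > m = n(\ln n + c)] = \Pr\left[\bigcup_{i=1}^{n} E_i^m\right] \approx 1 - \Pr\left[\bigcap_{i=1}^{n} \left(\neg E_i^m\right)\right] \approx 1 - e^{-e^{-c}}.$$

This shows that all coupons are collected within $m = n(\ln n + c)$ trials with probability close to 1 once $c$ is a moderately large positive constant. There is also not a lot of deviation from $n \ln n$, since the probability $e^{-e^{-c}}$ is close to 1 for a large positive $c$ and is negligibly small for a large negative $c$.

Result 1.1: Therefore, we can conclude our analysis of the Stable Marriage Problem by summarizing the following points:

1. The worst case of the algorithm is $n^2$ proposals.
2. The expected (average) case is $n \ln n$.
3. Deviation is small from the expected value.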
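For a finite $n$, the probability that all coupons are collected within $m$ trials can be computed exactly and compared with the limiting value $e^{-e^{-c}}$. The sketch below does this with the inclusion-exclusion formula, which is a standard identity but is not part of the derivation in the text; the names `prob_all_collected` and `limit_all_collected` and the choice $n = 200$ are assumptions made for this example.

```python
import math
from fractions import Fraction


def prob_all_collected(n, m):
    """Exact Pr[X <= m]: all n coupon types appear within m uniform trials.

    Inclusion-exclusion over the set of missing types:
    Pr = sum_k (-1)^k * C(n, k) * ((n - k) / n)^m.
    """
    total = sum((-1) ** k * math.comb(n, k) * (n - k) ** m
                for k in range(n + 1))
    return Fraction(total, n ** m)


def limit_all_collected(c):
    """Limiting value e^{-e^{-c}} for m = n (ln n + c)."""
    return math.exp(-math.exp(-c))


n = 200
for c in (-1.0, 0.0, 1.0, 2.0):
    m = round(n * (math.log(n) + c))
    print(c, m, float(prob_all_collected(n, m)), limit_all_collected(c))
```

The exact values are computed with integer arithmetic to avoid the cancellation errors that the alternating sum would cause in floating point. Since $\Pr[X > m] = 1 - \Pr[X \le m]$, the same numbers illustrate the limit $1 - e^{-e^{-c}}$ stated in Theorem 1.1.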