# Polya's Urn Scheme

3 October 2008

One of the fascinating things about Probability Theory is that sometimes the right answer runs counter to our intuition. In fact, this is the basis for many of the so-called proposition bets one sees, often in bar rooms. A classic example of this is the famous Birthday Problem. If you want to learn about this you can go to the archives of my articles on this site and check out my September 16, 1999 article entitled An Earful of Cider.

This month I want to show you an example where intuition is challenged; the example is due to the noted mathematician G. Polya. Here it is.

An urn contains r red balls and b black balls. The probability of choosing a red ball is therefore r/(r + b). Now here is the twist. A ball is randomly drawn from the urn and not shown to us. Its color is noted, and the ball is returned to the urn along with c additional balls of the same color. We are not told what the color was. Clearly the mixture in the urn has changed, although we don't know how. The question is: what is the probability of choosing a red ball from the r + b + c balls now in the urn?
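If you would like to experiment before reading on, the procedure just described can be sketched in a few lines of Python. This is only an illustrative simulation, not part of the argument; the function names `draw_second_ball` and `estimate_red_frequency` are my own, and the sketch assumes the drawn ball goes back into the urn, which is what the count r + b + c implies.

```python
import random

def draw_second_ball(r, b, c, rng):
    """Simulate one round of the urn scheme; return the color of the second draw."""
    # First draw: red with probability r / (r + b). The color is not reported.
    first_is_red = rng.random() < r / (r + b)
    # The ball goes back, along with c more balls of the same color.
    if first_is_red:
        r += c
    else:
        b += c
    # Second draw from the r + b + c balls now in the urn.
    return "red" if rng.random() < r / (r + b) else "black"

def estimate_red_frequency(r, b, c, trials=100_000, seed=0):
    """Fraction of simulated rounds in which the second draw is red."""
    rng = random.Random(seed)
    reds = sum(draw_second_ball(r, b, c, rng) == "red" for _ in range(trials))
    return reds / trials
```

Calling, say, `estimate_red_frequency(3, 2, 4)` gives an empirical frequency that you can compare with your own guess.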

We will need the following facts from probability theory. If A and B are events, I'll denote the event that both A and B occur by A & B. The event that either A or B (or both) occurs is denoted by A or B. The conditional probability that A occurs given that B occurs is denoted by P(A|B). We then have the following:

P(A & B) = P(A|B)P(B) (1)

If A and B are disjoint events, meaning that A and B cannot both occur, then

P(A or B) = P(A) + P(B) (2)

We will use the following notation. R1 is the event that a red ball is chosen on the first draw and B1 is the event that a black ball is chosen on the first draw. Similarly, R2 and B2 are the events that a red or a black ball, respectively, is chosen on the second draw.

As noted above

P(R1) = r/(r + b) (3)

and

P(B1) = b/(r + b) (4)

If a red ball was drawn on the first draw the urn now contains r + c red balls and b black balls. Therefore

P(R2|R1) = (r + c)/(r + b + c) (5)

On the other hand if the first ball drawn was black then the urn contains r red balls and b + c black balls. Therefore

P(R2|B1) = r/(r + b + c) (6)

From relation (1) we have

P(R2 & R1) = P(R2|R1)P(R1) (7)

Substituting (3) and (5) into (7) we obtain

P(R2 & R1) = [(r + c)/(r + b + c)][r/(r + b)] (8)

Similarly,

P(R2 & B1) = P(R2|B1)P(B1) (9)

which gives us

P(R2 & B1) = [r/(r + b + c)][b/(r + b)] (10)

Now note that R2 & R1 and R2 & B1 are disjoint events, since R1 and B1 cannot both occur. Also, (R2 & R1) or (R2 & B1) is just R2, since either R1 or B1 must occur on the first draw. Applying (2) we obtain

P(R2) = P(R2 & R1) + P(R2 & B1) (11)

which by (8) and (10) gives us

P(R2) = [r(r + c) + rb] / [(r + b)(r + b + c)]

Factoring r out of the numerator gives r(r + b + c), so the factors (r + b + c) cancel and we are left with

P(R2) = r/(r + b) = P(R1)
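For readers who like to see the algebra checked with numbers, here is a small sketch that evaluates equations (3) through (11) in exact rational arithmetic. The function name `prob_red_second` and the sample values of r, b, and c are my own choices for illustration.

```python
from fractions import Fraction

def prob_red_second(r, b, c):
    """Evaluate P(R2) from equations (3)-(11) using exact fractions."""
    p_r1 = Fraction(r, r + b)                    # equation (3)
    p_b1 = Fraction(b, r + b)                    # equation (4)
    p_r2_given_r1 = Fraction(r + c, r + b + c)   # equation (5)
    p_r2_given_b1 = Fraction(r, r + b + c)       # equation (6)
    # Equation (11): sum of the two disjoint cases (8) and (10).
    return p_r2_given_r1 * p_r1 + p_r2_given_b1 * p_b1

# Arbitrary sample urns; each should reproduce r/(r + b).
for r, b, c in [(3, 2, 4), (1, 5, 10), (7, 7, 1)]:
    assert prob_red_second(r, b, c) == Fraction(r, r + b)
    print(r, b, c, prob_red_second(r, b, c))
```

With r = 3, b = 2 and c = 4, for instance, the two terms in (11) are 21/45 and 6/45, which add up to 27/45 = 3/5, the same as r/(r + b).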

So there you have it. The probability that the second draw will be a red ball is exactly the same as the probability that the first ball drawn was red. So if this were a betting proposition, a fair payoff for a bet on red would be b : r, as long as we have no information about the color of the first ball drawn. In fact, one can show that the process can be repeated: after n - 1 hidden draws the urn holds r + b + (n - 1)c balls, and as long as we have no information about the colors of those draws, the probability of drawing a red ball on the nth draw is still r/(r + b). Cute! It may surprise you to learn that this urn model has been used to model the spread of infectious diseases. I'll leave you to ponder that one. See you next month.
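A postscript for the curious: the claim about the nth draw can be checked exactly for small cases. The sketch below is my own illustration, not a proof. It assumes each drawn ball is returned along with c more of its color, so that before the nth draw the urn holds r + b + (n - 1)c balls, and it relies on the fact that the urn's makeup depends only on how many of the earlier draws were red. The name `prob_red_on_draw` is hypothetical.

```python
from fractions import Fraction

def prob_red_on_draw(r, b, c, n):
    """Exact probability that draw number n (n >= 1) is red, earlier colors unknown."""
    # dist[k] = probability that exactly k of the draws so far were red.
    dist = {0: Fraction(1)}
    for draws_so_far in range(n - 1):
        total = r + b + draws_so_far * c
        new_dist = {}
        for k, p in dist.items():
            p_red = Fraction(r + k * c, total)  # k earlier reds added k*c red balls
            new_dist[k + 1] = new_dist.get(k + 1, Fraction(0)) + p * p_red
            new_dist[k] = new_dist.get(k, Fraction(0)) + p * (1 - p_red)
        dist = new_dist
    # Average the chance of red over every possible hidden history.
    total = r + b + (n - 1) * c
    return sum(p * Fraction(r + k * c, total) for k, p in dist.items())

# Spot check on an arbitrary urn: every draw should give r/(r + b).
for n in range(1, 7):
    assert prob_red_on_draw(3, 2, 4, n) == Fraction(3, 5)
```

The loop at the end only tests one urn and six draws, so treat it as a sanity check rather than a demonstration of the general result.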

Don Catlin can be reached at 711cat@comcast.net

Donald Catlin

Don Catlin is a retired professor of mathematics and statistics from the University of Massachusetts. His original research area was stochastic estimation applied to submarine navigation problems, but he has spent the last several years doing gaming analysis for gaming developers and writing about gaming. He is the author of The Lottery Book: The Truth Behind the Numbers, published by Bonus Books.

#### Books by Donald Catlin:

Lottery Book: The Truth Behind the Numbers