Sample size
3 December 2006
Wow, Jim, that's quite a plateful. Let me first address the topics with which I can't help you. I don't see how to define mathematical quantities that correspond to your X% and Y%. There is, of course, the classical Gambler's Ruin problem, which determines probabilities of crossing various bankroll thresholds, but that doesn't seem to be what you are asking. I'll keep thinking about it but I'm not making any promises.

I'm happy that you have had such good luck with the 5 count. You should realize, however, that as I showed and Frank has openly reiterated, unless you are using a controlled shot, or some other player at the table is using a controlled shot, your expected result will be a 1.414...% loss of your Pass Line wagers (the odds are even money). The 5 count simply reduces the number of such wagers in a given time period.

Now, about the sample size, let's look at a problem for which we already know the answer, namely, a Pass Line wager with a random shooter. The question is: how many hands (not rolls) must we play to have, say, a 95% chance that our empirical estimate of the correct 1.414...% is within error $e$?

The sample space is {Pass, Don't Pass}. We define a sequence of random variables $X_i$, where $i = 1, 2, \ldots, n$, as follows: $X_i = 1$ if the outcome is Pass and $X_i = 0$ if the outcome is Don't Pass. Similarly we define a sequence $Y_i$ such that $Y_i = 1$ if the outcome is Don't Pass and $Y_i = 0$ if the outcome is Pass. If the expected value $E(X_i)$ of $X_i$ is $p$ and the expected value $E(Y_i)$ is $q$, then we know from direct calculation (see my article The Pass Line in the archives) that $p = 244/495$ and $q = 251/495$. The expected return for the Pass Line is $p - q = -7/495$, which is approximately -1.414%.

Clearly the sum $X_1 + X_2 + \cdots + X_n$ represents the number of units won in our $n$ hands. Similarly $Y_1 + Y_2 + \cdots + Y_n$ represents the amount lost.
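For readers who want to check the values of p and q without digging through the archives, here is a minimal sketch of one standard way to do the direct calculation. It assumes two fair dice and the usual Pass Line rules (7 or 11 wins on the come-out, 2, 3 or 12 loses, and a point wins if it repeats before a 7); the function name is my own, and this is not necessarily the route taken in the archived article.

```python
from fractions import Fraction

def pass_line_win_probability():
    """Exact probability that a Pass Line bet wins with a random shooter."""
    # Number of ways to roll each total with two fair dice.
    ways = {total: 6 - abs(total - 7) for total in range(2, 13)}
    prob = {t: Fraction(w, 36) for t, w in ways.items()}

    # Come-out roll: 7 or 11 wins immediately.
    p = prob[7] + prob[11]
    # A point (4, 5, 6, 8, 9, 10) wins if it repeats before a 7 appears.
    for point in (4, 5, 6, 8, 9, 10):
        p += prob[point] * prob[point] / (prob[point] + prob[7])
    return p

p = pass_line_win_probability()
q = 1 - p
print(p, q, p - q)            # exact fractions
print(float(p - q) * 100)     # expected return in percent
```

Using exact fractions rather than floating point keeps the result in the same form as the 244/495 and 251/495 quoted above.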
The difference in these two sums represents the net won (actually lost) and this number divided by $n$ is an estimate of $p - q$. Now $(X_1 + X_2 + \cdots + X_n)/n$ represents the average or mean of the wins and will be denoted by $\bar{X}$; similarly $\bar{Y}$ will denote the average of the losses. Thus $\bar{X} - \bar{Y}$ is an estimator for $p - q$. For each $i$ it is easy to see that $X_i + Y_i = 1$, so if we add all of these terms up and divide by $n$ we have $\bar{X} + \bar{Y} = 1$, so $\bar{Y} = 1 - \bar{X}$. Thus we can replace $\bar{Y}$ by $1 - \bar{X}$ in the expression $\bar{X} - \bar{Y}$ and obtain $2\bar{X} - 1$. The expected value of this random variable is clearly $2p - 1$.

Using the above facts we can now state our objective. We want $2\bar{X} - 1$ to differ from $2p - 1$ by less than error $e$. In symbols

$$2p - 1 - e < 2\bar{X} - 1 < 2p - 1 + e \tag{1}$$

If we add 1 to the three terms in (1) and then divide by 2 we obtain

$$p - e/2 < \bar{X} < p + e/2 \tag{2}$$

or, subtracting $p$ throughout (2),

$$-e/2 < \bar{X} - p < e/2 \tag{3}$$

The sequence of random variables $X_i$ is an independent sequence, so the variance of the sum is the sum of the variances. Since $E(X_i^2) = p \cdot 1^2 + q \cdot 0^2 = p$ we have

$$\mathrm{Var}(X_i) = E\big((X_i - p)^2\big) = E(X_i^2) - 2pE(X_i) + p^2 \tag{4}$$

or

$$\mathrm{Var}(X_i) = p - 2p^2 + p^2 = p - p^2 = p(1 - p) \tag{5}$$

It follows from (5) that

$$\mathrm{Var}\left(\frac{1}{n}(X_1 + X_2 + \cdots + X_n)\right) = \left(\frac{1}{n}\right)^2 n\,p(1 - p) = \frac{p(1 - p)}{n} \tag{6}$$

Thus the variance of $\bar{X}$ is the expression in (6), so the standard deviation of $\bar{X}$ is the square root of that expression.

The sequence we have been dealing with is a sequence of Bernoulli random variables; their sum is a binomial random variable, and for large $n$ it can be approximated very accurately with the normal distribution. The expression $(\bar{X} - p)/\sqrt{p(1 - p)/n}$ is then, to a very good approximation, a standard normal random variable. From a standard normal table we can look up the value for 95%; it is 1.96. Hence we have the assertion that

$$P\left(-1.96 < \frac{\bar{X} - p}{\sqrt{p(1 - p)/n}} < 1.96\right) = 0.95 \tag{7}$$

where $P$ represents probability.
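The 95% statement in (7) can be checked empirically. The sketch below is only an illustration under stated assumptions: each hand is drawn directly as a win with probability 244/495 rather than by rolling simulated dice, and the function name, seed, n = 1000 and 2000 repetitions are arbitrary choices of mine. It standardizes the mean of each batch of hands as in (7) and reports how often the result falls between -1.96 and 1.96.

```python
import math
import random

P_WIN = 244 / 495  # Pass Line win probability quoted in the article

def standardized_means(n, trials, seed=2006):
    """Return (mean - p) / sqrt(p(1 - p)/n) for `trials` batches of n hands."""
    rng = random.Random(seed)
    sd = math.sqrt(P_WIN * (1 - P_WIN) / n)  # standard deviation from (6)
    zs = []
    for _ in range(trials):
        # Each hand is a win (1) with probability P_WIN, otherwise a loss (0).
        wins = sum(rng.random() < P_WIN for _ in range(n))
        zs.append((wins / n - P_WIN) / sd)
    return zs

zs = standardized_means(n=1000, trials=2000)
coverage = sum(-1.96 < z < 1.96 for z in zs) / len(zs)
print(f"fraction within +/-1.96: {coverage:.3f}")
```

The printed fraction should come out close to 0.95, though it will not match exactly because the number of repetitions is finite and the mean of n hands can take only discrete values.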
Multiplying the inequality in (7) through by $\sqrt{p(1 - p)/n}$ we obtain

$$P\left(-1.96\sqrt{p(1 - p)/n} < \bar{X} - p < 1.96\sqrt{p(1 - p)/n}\right) = 0.95 \tag{8}$$

Comparing expression (8) with (3) we see that if we want the difference between $\bar{X}$ and $p$ to be within $e/2$ with probability 0.95 then we had better set

$$e/2 = 1.96\sqrt{p(1 - p)/n} \tag{9}$$

Solving (9) for $n$ we obtain

$$n = \frac{4(1.96)^2\,p(1 - p)}{e^2} \tag{10}$$

So, we know $p$ and can calculate $p(1 - p)$; it is 0.24995. If we want our estimate to be within a tenth of a percent then $e = 0.001$. Substituting these into expression (10) we get 3,840,831. I guess my 200 million was sufficient but I don't think 30 to 50 trials will do the job. In fact 200 million gives me an accuracy of better than 0.0003. Incidentally, in (10) if you want 99% confidence, replace 1.96 by 2.576.

If any of you have questions I can be reached at 711cat@comcast.net. See you next month.

This article is provided by the Frank Scoblete Network. Melissa A. Kaplan is the network's managing editor. If you would like to use this article on your website, please contact Casino City Press, the exclusive web syndication outlet for the Frank Scoblete Network. To contact Frank, please e-mail him at fscobe@optonline.net.
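The arithmetic behind expressions (9) and (10) is easy to check with a few lines of code. The following is a minimal sketch, not a definitive tool: the function names `hands_needed` and `accuracy` are my own, and the only inputs are the p = 244/495 and the normal-table value 1.96 used in the article.

```python
import math

P_WIN = 244 / 495  # Pass Line win probability quoted in the article

def hands_needed(e, z=1.96, p=P_WIN):
    """Expression (10): n = 4 z^2 p (1 - p) / e^2."""
    return 4 * z**2 * p * (1 - p) / e**2

def accuracy(n, z=1.96, p=P_WIN):
    """Expression (9) solved for e: e = 2 z sqrt(p (1 - p) / n)."""
    return 2 * z * math.sqrt(p * (1 - p) / n)

print(round(hands_needed(0.001)))   # hands needed for e = 0.001 at 95%
print(accuracy(200_000_000))        # e achieved by 200 million hands
print(accuracy(50))                 # e achieved by only 50 hands
```

Run as written, this should reproduce the figure of roughly 3.84 million hands and show that 50 hands leave e at more than a quarter of a unit, far too coarse to detect a 1.414% edge. Passing a different `z` gives other confidence levels.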
Donald Catlin