Backgammon Rollouts

Confidence intervals

From:   Gerry Tesauro
Address:   tesauro@watson.ibm.com
Date:   2 February 1994
Subject:   confidence intervals for rollouts
Forum:   rec.games.backgammon
Google:   CKM3ML.BuL@hawnews.watson.ibm.com

The following article is by Stig Eide; I'm posting it for him because he doesn't have posting privileges himself.

---

CAN WE TRUST THE ROLLOUTS?

Now that a growing number of backgammon programs play a decent game, we have a powerful tool: the rollout feature. I want to present a statistical tool that should accompany any rollout: the confidence interval.

What is a confidence interval?

After you have performed a rollout, you have an estimate of the probability of a 'success'. A success can be winning or losing; it doesn't matter which. The confidence interval is an interval centred on the estimate, together with a statement of how sure you can be that the true probability lies inside it.

The formula:

    a = z * sqrt( (y/n) * (1 - y/n) / n )

The variables:

    n    is the number of games rolled out.
    y    is the number of 'successes' that occurred during the rollout.
    y/n  is the estimated probability that a 'success' occurs.
    a    is the largest deviation from the estimate y/n that the
         interval allows (its half-width).

The confidence interval is (y/n - a, y/n + a).

z is chosen to set the reliability of the confidence interval. You can choose z to be any positive real number and get whatever confidence level you want, but here are the three most commonly used values of z and their confidence levels:

    z = 1.96  gives you a 95% confidence interval
    z = 2.17  gives you a 97% confidence interval
    z = 3     gives you a 99.73% confidence interval

So, if you choose z to be 1.96, you can be 95% sure that the probability of a success is between y/n - a and y/n + a.

EXAMPLE:

You have rolled out 4000 games from a position that occurred during a game. The computer tells you that, playing both sides, it won 3037 of those 4000 games for you (75.925%). You want a confidence interval that is 97% reliable.

The variables:

    z is 2.17
    y is 3037
    n is 4000

    a = z * sqrt( (y/n) * (1 - y/n) / n )
      = 2.17 * sqrt( (3037/4000) * (1 - 3037/4000) / 4000 )
      = 0.0147

The 97% confidence interval is now (0.75925 - 0.0147, 0.75925 + 0.0147), or (0.745, 0.774).

This means that you can be 97% sure that the chance of winning the position is between 74.5% and 77.4%. That interval includes 75%, the usual borderline between a take and a drop when gammons and cube ownership are ignored. So if you want to claim that this position is either a drop or a take, you have to perform a new rollout with more than 4000 games, because that will narrow the confidence interval (give you a smaller a).

Stig Eide (stig.eide@avh.unit.no)
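
The calculation in the article is easy to script. What follows is a minimal Python sketch, not anything taken from the original post: the function names (rollout_confidence_interval, z_for_confidence, games_needed) are made up for illustration. It assumes the games in a rollout are independent trials, which is the same normal approximation the formula itself relies on. games_needed simply solves the formula for n, to show how many games a target half-width a would require.

```python
import math
from statistics import NormalDist


def z_for_confidence(level):
    """Two-sided z value for a confidence level such as 0.95 or 0.97."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    # Half of the excluded probability sits in each tail.
    return NormalDist().inv_cdf(0.5 + level / 2.0)


def rollout_confidence_interval(successes, games, z=1.96):
    """Return (estimate, a, low, high) using a = z*sqrt(p*(1-p)/n)."""
    if games <= 0 or not 0 <= successes <= games:
        raise ValueError("need games > 0 and 0 <= successes <= games")
    p = successes / games                      # y/n in the article
    a = z * math.sqrt(p * (1.0 - p) / games)   # half-width of the interval
    return p, a, p - a, p + a


def games_needed(p, a, z=1.96):
    """Smallest n with z*sqrt(p*(1-p)/n) <= a (the formula solved for n)."""
    if a <= 0:
        raise ValueError("a must be positive")
    return math.ceil(z * z * p * (1.0 - p) / (a * a))


if __name__ == "__main__":
    # The article's example: 3037 wins in 4000 games, 97% interval.
    p, a, low, high = rollout_confidence_interval(3037, 4000, z=2.17)
    print(f"estimate {p:.5f}, a = {a:.4f}, interval ({low:.3f}, {high:.3f})")
    # Games required to shrink a to half a percentage point at the same z.
    print(games_needed(p, 0.005, z=2.17))
```

For the article's example (3037 wins in 4000 games, z = 2.17) the first line printed shows a half-width of about 0.0147. Because a shrinks only with the square root of n, halving the interval takes roughly four times as many games.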
