Rollouts

 Standard error and JSD

 From: Stick
 Address: checkmugged@yahoo.com
 Date: 15 October 2007
 Subject: Explaining (J)SD w/GNU & rollouts
 Forum: BGonline.org Forums

```
Pretend I know little to nothing about statistical significance (Chuck can
vouch for that), standard deviations, joint standard deviations, etc ...
can you explain it using the above example/rollout? [or know of a link
where it has already been done, I can't find one with what I'm looking for]

 13  14  15  16  17  18      19  20  21  22  23  24
+---+---+---+---+---+---+---+---+---+---+---+---+---+
| X               O     |   | O   O           X   X |
| X               O     |   | O   O                 |
| X                     |   | O                     |
| X                     |   | O                     |
|                       |   |                       |   +---+
|                       |   |                       |   | 1 |
|                       |   |                       |   +---+
| O                     |   | X                     |
| O                     |   | X                     |
| O               X     |   | X                     |
| O               X     |   | X                   O |
| O   X           X     |   | X                   O |
+---+---+---+---+---+---+---+---+---+---+---+---+---+
 12  11  10   9   8   7       6   5   4   3   2   1

X rolls 1-1.

Rollout:

Play                       Equity    Diff.     Std.Err.
------------------------   -------   -------   --------
1.  8/7(2), 6/5(2)         +0.0773              0.0043
2.  23/22, 11/10, 6/5(2)   +0.0615   -0.0158    0.0042
```

 Bryce writes:

```
So no one has touched this yet, guess I'll give it a shot. It's been
about 5 years since my last stats course, so others should correct me
if I say something stupid.

The cubeful equity is listed as .0773, with a standard error (SE) of
.0043. The actual position after 8/7(2), 6/5(2) has some exact equity
A, but we don't know what it is, so we run trials to try to estimate
it. The best guess after these trials is .0773. The standard error is
a description of how good our measurement is and lets us construct
confidence intervals to tell us where the actual value A might be.
1 SE around our estimate corresponds to about 68%, 2 SE corresponds
to 95%*, and 3 SE is about 99.7%. So for this first position we might
say

  There is a 68% chance that A lies in the interval
    (.0773-1*.0043, .0773+1*.0043) = (.0730, .0816)
  There is a 95% chance that A lies in the interval
    (.0773-2*.0043, .0773+2*.0043) = (.0687, .0859)
  There is a 99.7% chance that A lies in the interval
    (.0773-3*.0043, .0773+3*.0043) = (.0644, .0902)

More trials = smaller SE = smaller intervals = better idea what the
actual value is. One could make similar intervals for the second
position.

When you're comparing two positions, what you actually want is the
difference between the two equities: we'll call this A - B. Our best
guess for this difference is, not surprisingly, the difference between
the guesses for A and B: .0158. It turns out you can't just add the
two errors of A and B to get the error for (A - B): it's given by

  SE(A - B) = sqrt(SE(A)^2 + SE(B)^2)**

(This is the joint standard error.) In our case this comes out to
about .00601. Suppose we were to now make a 95% confidence interval
for A - B: it would be

  (.0158 - 2*.0060, .0158 + 2*.0060) = (.0038, .0278)

The entire interval is greater than 0; that means we can be 95% sure
the actual difference between the equities is positive. Actually, you
can even say that we are 97.5% sure by symmetry (there is a 2.5%
chance it's below .0038, and a 2.5% chance it's higher than .0278--in
the latter case it's still positive)***.

Instead of creating intervals, it's usually easier to go in the
reverse direction: you have your estimate of A - B (.0158), you have
SE(A - B) = .00601, so you can see how many (joint) standard errors
you are away from 0. In this case we're at .0158/.00601 = 2.63
standard errors; you can convert this to a probability by doing an
integral numerically**** to discover that there's a 99.6% chance that
A is better than B, and a .4% chance that B is better than A.

Reasonably good estimates to remember:

  0 JSEs: 50% chance that top position is better
          (in other words: TCTC)
  1 JSE:  84.1% chance that top position is better
  2 JSEs: 97.7% chance that top position is better
  3 JSEs: 99.9% chance that top position is better

Of course, it is up to you what percentage you think is significant.
You'd probably get eyed suspiciously claiming something is true with
only 1 JSE, but with 3+ JSEs few would doubt the veracity of your
claim.

I hope that helped?

--Bryce

*    95% is actually more like 1.96 standard errors; 2 standard
     errors is closer to 95.4%.

**   This is because, while standard errors do not add, the variances
     (which are the squares of the standard errors) do, provided the
     two estimates are independent:
     Var(A-B) = Var(A+B) = Var(A) + Var(B).

***  I might be confusing one- and two-tailed tests here, but it looks
     right to me.

**** Or just use this applet:
     http://psych.colorado.edu/~mcclella/java/normal/normz.html
     and enter "2.63" or "-2.63" into the z-score box.
```
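The arithmetic in the reply above can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes the two rollout estimates are independent and approximately normally distributed, and the function names (`normal_cdf`, `joint_standard_error`, `confidence_interval`) are made up for illustration; they are not part of GNU Backgammon or any other program. The "integral" mentioned in the post is evaluated with the standard library's `math.erf`.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution, computed from erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def joint_standard_error(se_a, se_b):
    """sqrt(SE(A)^2 + SE(B)^2); assumes the two estimates are independent."""
    return math.hypot(se_a, se_b)


def confidence_interval(estimate, se, n_se):
    """Interval of n_se standard errors on either side of the estimate."""
    return (estimate - n_se * se, estimate + n_se * se)


# Rollout figures quoted in the question.
equity_a, se_a = 0.0773, 0.0043   # 8/7(2), 6/5(2)
equity_b, se_b = 0.0615, 0.0042   # 23/22, 11/10, 6/5(2)

diff = equity_a - equity_b
jse = joint_standard_error(se_a, se_b)
z = diff / jse                    # number of JSEs away from zero
p_a_better = normal_cdf(z)        # one-sided probability that A > B

low, high = confidence_interval(diff, jse, 2)

print(f"difference        : {diff:.4f}")
print(f"joint std. error  : {jse:.5f}")
print(f"JSEs from zero    : {z:.2f}")
print(f"P(A better than B): {p_a_better:.1%}")
print(f"2-JSE interval    : ({low:.4f}, {high:.4f})")
```

Run as is, this should give roughly the figures in the post: a joint standard error near .0060, about 2.63 JSEs, and about a 99.6% chance that the first play is better. Calling `normal_cdf(1)`, `normal_cdf(2)` and `normal_cdf(3)` gives the values behind the rule-of-thumb table.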

### Rollouts

Cautionary tale  (Kit Woolsey, Sept 1995)
Combining rollouts  (Gregg Cattanach+, Dec 2003)
Confidence intervals  (Bob Koca, Nov 2010)
Confidence intervals  (Timothy Chow, May 2010)
Confidence intervals  (Gerry Tesauro, Feb 1994)
Cubeless vs centered-cube rollouts  (Ron Karr, Dec 1997)
Duplicate dice  (David Montgomery, June 1998)
How reliable are rollouts?  (David Montgomery, Aug 1999)
Level-5 versus level-6 rollouts  (Michael J. Zehr, June 1998)
Level-5 versus level-6 rollouts  (Chuck Bower, Aug 1997)
Positions with inaccurate rollouts  (Douglas Zare, Oct 2002)
Reporting results of rollouts  (David Montgomery, June 1995)
Rollout settings  (Lokicol+, Apr 2010)
Settlement limit  (Michael J. Zehr, Apr 1998)
Settlement limit  (Kit Woolsey, Dec 1997)
Settlement limit in races  (Alexander Nitschke, Dec 1997)
Some guidelines  (Kit Woolsey, Apr 1996)
Standard error and JSD  (rambiz+, Feb 2011)
Standard error and JSD  (Stick+, Oct 2007)
Systematic error  (Chuck Bower, Oct 1996)
Tips for doing rollouts  (Douglas Zare, June 2002)
Truncated rollouts  (Gregg Cattanach, Oct 2002)
Truncated rollouts: pros and cons  (Jason Lee+, Jan 2006)
What is a rollout?  (Gregg Cattanach, Dec 1999)