What exactly is the standard error reported by gnu while doing rollouts?
Say you roll out two moves and gnu reports:
move 1: 0.70 winning chance 0.005 SE
move 2: 0.68 winning chance 0.008 SE
Can someone elaborate, please? Regardless of the number of games rolled
out, how sure can I be that move 1 is better than move 2? Please
note that 0.68 + 0.008 + 0.005 < 0.70. For the sake of simplicity I've
assumed a cubeless rollout at DMP with no possible gammons.


Tom Keith writes:
Suppose you roll out two plays and want to know whether they are correctly
ranked by their rollout results. (The plays could be wrongly ranked if the
poorer play had luckier dice in the rollout.)
What you can do is compute a "joint standard deviation" (JSD) of the two
plays. If the individual standard deviations (the SE figures reported
for each play) are SD1 and SD2,
JSD = sqrt( SD1^2 + SD2^2 ).
Then take D, the difference between the rollout results, and divide by the
JSD. Consult the following table to find the probability the plays are
correctly ranked.
            Probability the plays
D / JSD     are correctly ranked

  0.0            50%
  0.5            69%
  1.0            84%
  1.5            93.3%
  2.0            97.7%
  2.5            99.4%
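
The table values match the cumulative standard normal distribution evaluated at D / JSD. As a rough sketch, assuming the rollout errors of the two plays are independent and approximately normal (the function name normal_cdf is only illustrative), the table can be reproduced like this:

```python
import math

def normal_cdf(z):
    """Cumulative standard normal distribution, computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Print the probability of a correct ranking for each D / JSD value in the table.
for z in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5):
    print(f"{z:3.1f}   {100.0 * normal_cdf(z):5.1f}%")
```

This also lets you look up ratios that fall between the table rows instead of interpolating by eye.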
Your example:
If R1 = 0.70 and SD1 = 0.005,
and R2 = 0.68 and SD2 = 0.008, then
JSD = sqrt( 0.005^2 + 0.008^2 ) = 0.0094
D = 0.70 - 0.68 = 0.02
D / JSD = 0.02 / 0.0094 = 2.13
From the table, there is roughly a 98% chance that an infinite rollout
would uphold the order of these plays.
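
The same calculation can be done without the table. This is a minimal sketch, assuming independent and approximately normal rollout errors. The function names joint_sd and prob_correctly_ranked are made up for illustration and are not part of gnu.

```python
import math

def joint_sd(sd1, sd2):
    """Joint standard deviation of two independent rollout results."""
    return math.sqrt(sd1 ** 2 + sd2 ** 2)

def prob_correctly_ranked(r1, sd1, r2, sd2):
    """Probability that the play with the better rollout result is truly better.

    Assumes (not guaranteed) that the two rollout errors are independent
    and approximately normally distributed.
    """
    z = abs(r1 - r2) / joint_sd(sd1, sd2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# The example from the question: 0.70 +/- 0.005 versus 0.68 +/- 0.008.
p = prob_correctly_ranked(0.70, 0.005, 0.68, 0.008)
print(f"JSD = {joint_sd(0.005, 0.008):.4f}")
print(f"Probability correctly ranked = {100.0 * p:.1f}%")
```

Carrying full precision gives a ratio slightly below the hand-rounded 2.13, but the conclusion of roughly 98% is unchanged.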




