Forum Archive :
GNU Backgammon
Hi all,
I just finished a series of 100 7-point matches between Snowie 4 and
GNU 0.13 in order to see how they did. The series was played by Tony
Lezard's program Dueller 2.1
(http://www.jobstream.com/~tony/backgammon/) and finished 56-44 in
GNU's favor. Though it is a very nice victory for GNU, analysis (see
below) shows that the two programs are of the same strength to within
the margin of error.
The conditions were Snowie 4 playing at 3-ply Precise using the larger
bearin DB from the CD, whereas GNU was simply playing at Supremo (its
equivalent of Snowie's 3-ply as GNU starts counting at 0-ply) and
using the Snowie match equity table. The reason for this last detail
is so that any disagreement on doubling decisions is due to different
evaluations and not the table used. Snowie 4 rolled the dice in all
the matches.
Anyone wishing to receive the matches (in .mat format) need only ask,
though I expect they will be available from Tony's site very soon.
The zipped file is a mere 200k BTW.
Albert Silver
Jørn Thyssen, one of the principal authors of GNU and certainly the
most active contributor to the GUI development, as well as to the cube
and match formulae, analyzed the series of 100 7-point matches,
adjusting for luck. The result was a very interesting tie:
The average of the luck adjustment values (for all 100 matches) is
0.500, i.e., the bots are equally good.
For each match I [Jørn] analysed the luck at 0-ply:
set priority idle
import mat ...
set analysis cube off
set analysis moves off
analyse match
show stat match
I extracted the total luck rate from the match statistics:
Player               Snowie4              gnubg
Luck rate (total)    -1.501 (-49.717%)    -0.341 (-1.415%)
In the example above gnubg won, so the luck adjusted result is the
actual result minus gnubg's luck plus Snowie's luck:
100% - ( -1.415% ) + ( -49.717% ) = 51.698%
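The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the rule "result minus own luck plus opponent's luck" is inferred from the figures quoted in this post rather than taken from gnubg itself.

```python
import re

def parse_luck_rates(line):
    """Return the percentage figures from a 'Luck rate (total)' line as fractions."""
    percents = re.findall(r"\(([-+]?\d+(?:\.\d+)?)%\)", line)
    return [float(p) / 100.0 for p in percents]

def luck_adjusted(result, own_luck, opponent_luck):
    """Actual result (1 = win, 0 = loss) minus own luck plus the opponent's luck."""
    return result - own_luck + opponent_luck

# The statistics line quoted above: Snowie4's column comes first, then gnubg's.
line = "Luck rate (total) -1.501 (-49.717%) -0.341 (-1.415%)"
snowie_luck, gnubg_luck = parse_luck_rates(line)

# gnubg won this match, so its actual result is 1.
adjusted = luck_adjusted(1, gnubg_luck, snowie_luck)
print(f"{adjusted:.5f}")  # 0.51698
```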
I averaged all the luck adjusted results: 49.9947% with standard
deviation 13.5%, hence the 95% confidence interval is 50.0% +/-2.7%
(13.5% * 1.96/sqrt(100) = 2.7%).
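For anyone wanting to redo the interval, here is a minimal sketch of the same normal-approximation calculation. The helper names are hypothetical, and using the sample standard deviation in `summarize` is an assumption, since the post does not say which form was used. With the rounded 13.5% figure the half-width comes out near 2.65%, in line with the 2.7% quoted above.

```python
import math
import statistics

def ci_half_width(std_dev, n, z=1.96):
    """Half-width of a normal-approximation confidence interval for a mean."""
    return z * std_dev / math.sqrt(n)

def summarize(values, z=1.96):
    """Return (mean, sample standard deviation, half-width) for a list of results."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (an assumption)
    return mean, std_dev, ci_half_width(std_dev, len(values), z)

# Figures quoted in the post, in percent: standard deviation 13.5 over 100 matches.
half = ci_half_width(13.5, 100)
print(f"+/- {half:.2f}%")
```

`summarize` would be fed the 100 luck adjusted results from the attached file, which is not reproduced here.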
I've attached a file which has the following entries:
game number, actual result, gnubg luck, snowie luck, luck adjusted
result
For example, for game 100 (my example above):
100 1 -.01415 -.49717 .51698
Some of the games could be very interesting to inspect carefully. For
bots of similar strength we expect luck adjusted results around 50%.
However, this is not always true in the 100 match sample you've sent
me:
Examples:
8 0 .19603 .19853 .00250
39 1 .07856 .00087 .92231
Either gnubg's luck analysis is totally wrong, or Snowie played very
badly in game 39 and gnubg played very badly in game 8.
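Picking out such matches from the file can be sketched as below. The row layout follows the five columns described earlier; the function names and the 0.3 cutoff are hypothetical choices for this illustration, not anything gnubg provides. Only the three rows quoted in this post are used.

```python
def parse_row(row):
    """Split one row: game number, actual result, gnubg luck, snowie luck, adjusted."""
    game, result, gnubg_luck, snowie_luck, adjusted = row.split()
    return int(game), int(result), float(gnubg_luck), float(snowie_luck), float(adjusted)

def flag_outliers(rows, threshold=0.3):
    """Return game numbers whose luck adjusted result is far from 0.5."""
    flagged = []
    for row in rows:
        game, result, gnubg_luck, snowie_luck, adjusted = parse_row(row)
        if abs(adjusted - 0.5) > threshold:  # threshold is an arbitrary cutoff
            flagged.append(game)
    return flagged

# Rows quoted in the post.
rows = [
    "100 1 -.01415 -.49717 .51698",
    "8 0 .19603 .19853 .00250",
    "39 1 .07856 .00087 .92231",
]
print(flag_outliers(rows))  # [8, 39]
```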
Match 39 re-analysed:
0-ply: 1 .07856 .00087 .92231
1-ply: 1 .1150 -.0151 .87449
2-ply: painfully slow; I gave up
The result changes by about 5 percentage points, but we're still far
from a luck adjusted result of 50%. I can't explain this...
Jørn