Rollouts


 
Cautionary tale

From:   Kit Woolsey
Address:   kwoolsey@netcom.com
Date:   24 September 1995
Subject:   Re: X to play 6-3
Forum:   rec.games.backgammon
Google:   kwoolseyDFFEy1.DuE@netcom.com

David Montgomery (monty@cs.umd.edu) wrote:

>   +24-23-22-21-20-19-+---+18-17-16-15-14-13-+
>   | X     X  O  O  O |   | X  O             |
>   |          O  O  O |   | X  O             |
>   |                  |   |    O             |
>   |                  |   |    O             |
>   |                  |   |                  |
>   |                  |   |                  | [1]
>   |                  |   |                  |
>   |                  |   |                O |
>   |                  |   | X              O |
>   |                  |   | X              O |
>   |          X  X  X |   | X              O |
>   |          X  X  X |   | X           X  O |
>   +-1--2--3--4--5--6-+---+-7--8--9-10-11-12-+
> Money game. X to play 6-3.
>
> Thanks for all the responses to this problem.
> This is a position from _Costa Rica 1993_.
> Wilcox Snellings played 22/13.
>
> My preference before seeing any rollouts
> or analysis was for 11/5 7/4.  This was also
> the choice of Herb Gurland, a top Boston player.
> The authors of _Costa Rica 1993_ also preferred
> 11/5 7/4.
>
> Wilcox Snellings rolled two plays out by hand
> 108 times with the following results:
>
> 11/5 7/4  -.42
> 22/13     -.50
>
> The authors rolled another play out 108 times:
>
> 11/8 24/18 -.50
>
> I rolled all of these plays out (and several others)
> 3888 times on Jellyfish (no truncation, duplicate
> dice, 3 sets of 1296 with seeds 2430, 2431, and 2432).
>
> Jellyfish cubeless equities:
>
> 22/13      -.363
> 22/16 11/8 -.405
> 22/16  7/4 -.416
> 24/18 11/8 -.446
> 11/5   7/4 -.449
> 24/18  7/4 -.456
> 24/15      -.458
>
> I wasn't really that surprised that 22/13 came out
> on top, although it wasn't the play that I would
> have made.  But I was *very* surprised that it came
> out right by so much.  This mistake actually costs
> about 2/10 of a point when the cube is figured in.
>
> I would welcome any further illumination on why
> 22/13 is so much better than the other plays,
> especially 11/5 7/4.
>
> David Montgomery
> monty on FIBS
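
To make the size of the gaps in David's table easier to see, here is a minimal Python sketch that ranks the plays by their cubeless rollout equity and reports how far each one trails the leader. The equities are copied from the quoted table; the names `jellyfish`, `gaps`, and `games` are illustrative only and are not part of Jellyfish.

```python
# Cubeless rollout equities copied from David's table above.
jellyfish = {
    "22/13": -0.363,
    "22/16 11/8": -0.405,
    "22/16 7/4": -0.416,
    "24/18 11/8": -0.446,
    "11/5 7/4": -0.449,
    "24/18 7/4": -0.456,
    "24/15": -0.458,
}

# The play with the highest (least negative) equity is the rollout's choice.
best = max(jellyfish, key=jellyfish.get)

# Equity each play gives up relative to the best play.
gaps = {play: round(jellyfish[best] - eq, 3) for play, eq in jellyfish.items()}

# Three sets of 1296 games (1296 = 36 * 36).
games = 3 * 36**2

for play, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{play:<11} {jellyfish[play]:+.3f}  trails best by {gap:.3f}")
print(f"best play: {best}, games rolled out: {games}")
```

On these numbers 11/5 7/4 trails 22/13 by .086 cubeless. The sketch says nothing about whether that gap is real, which is the question the reply below takes up.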

While Jellyfish rollouts are usually accurate and quite informative,
occasionally they can give us wrong information.  One of the dangers is
that the program is simply misplaying the position, and this affects one
of the plays being rolled out more than the other.  Keep in mind that for
the rollout the program is playing at only 1-ply (the same as level 5).
This is necessary for speed -- using 2-ply would make the rollouts take
far longer.  The program still plays pretty well at 1-ply, but not nearly
as well as at 2-ply, and therefore it is more likely to be doing something
wrong in the play.  Most of the time this will not matter (particularly
in play vs. play problems), since these errors in play tend to cancel out
and generally are not huge anyway.  Occasionally, though, the two plays
being tested lead to different types of positions, where one play gives
the program a chance to make an error which the other play doesn't.

When I saw David's results, I suspected this might be happening.  My guess
was that after playing 11/5, 7/4 the program might be making the defensive
three point whenever it rolled a two.  I also thought this might be the
wrong strategy -- hanging back on the ace point with the back man and
springing the other checker could be better.  So, I decided to run a test.
I had X play 11/5, 7/4 with the 6-3, and gave O a 6-1 (played 13/6).  This
left the following position:

    13 14 15 16 17 18       19 20 21 22 23 24
   +------------------------------------------+
   |             O  X |   |  O  O  O  X     X |
   |             O  X |   |  O  O  O          |
   |             O    |   |  O                |
   |             O    |   |                   |
   |                  |   |                   |
   |                  |   |                   |
   | O                |   |                   |
   | O              X |   |     X  X          |
   | O              X |   |  X  X  X          |
   | O              X |   |  X  X  X          |
   +------------------------------------------+
    12 11 10  9  8  7        6  5  4  3  2  1

Now I gave X a 4-2 to play, and looked at Jellyfish's 1-ply opinion.  I
also rolled out the three logical plays 2952 times each, duplicate dice.
These were the results:

Play            1-ply           Rollout

24/22, 7/3      -.428           -.514
7/3, 5/3        -.501           -.486
22/18, 5/3      -.510           -.412

These results confirmed my suspicions.  Jellyfish was thematically
misplaying the position in its rollouts after playing 11/5, 7/4 with the
original 6-3.  However, after playing 22/13 with the 6-3 the program
didn't have the opportunity to make this sort of misplay: there was no
way to make the 22 point, and so no incentive to move the back checker.
This misplay might be sufficient to turn the rollout results of the 6-3
around, and certainly explains why 22/13 came out so much better than
11/5, 7/4 in David's rollout.
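
The comparison in the table above can be sketched in a few lines of Python. This is only an illustration using the six numbers from the table; the names (`plays`, `best_play`, `cost`) are made up for the example, and the rollout column is itself played at 1-ply, so the "cost" is an estimate rather than a true equity loss.

```python
# Equities copied from the table above: (1-ply estimate, rollout result).
plays = {
    "24/22, 7/3": (-0.428, -0.514),
    "7/3, 5/3":   (-0.501, -0.486),
    "22/18, 5/3": (-0.510, -0.412),
}

def best_play(column):
    """Return the play with the highest equity in a column (0 = 1-ply, 1 = rollout)."""
    return max(plays, key=lambda p: plays[p][column])

one_ply_choice = best_play(0)   # what the program plays during a rollout
rollout_choice = best_play(1)   # what the rollout says it should play

# Equity given up, by the rollout's own measure, when the 1-ply choice is played.
cost = plays[rollout_choice][1] - plays[one_ply_choice][1]

for play, (one_ply, rollout) in plays.items():
    print(f"{play:<12} 1-ply {one_ply:+.3f}  rollout {rollout:+.3f}  "
          f"difference {rollout - one_ply:+.3f}")
print(f"1-ply plays {one_ply_choice}; rollout prefers {rollout_choice}; "
      f"estimated cost {cost:.3f}")
```

On these numbers the play the program ranks first at 1-ply is the one the rollout ranks last, at an estimated cost of about .10 whenever X rolls this 4-2.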

Any time you are suspicious about the results of a rollout, it is vital
to examine how the program is playing at least the next couple of rolls
before accepting the results of the rollout as gospel.  The rollouts are
good, but we still have to keep our eyes open or we may fall into some
unexpected traps.

Kit
 