Rollouts

Advice

From:   David Montgomery
Address:   monty@cs.umd.edu
Date:   7 April 1996
Subject:   Re: How best to do Jellyfish rollouts? (long)
Forum:   rec.games.backgammon
Google:   4k8njk$6lc@twix.cs.umd.edu

> I just bought the Jellyfish analyzer 2.0 and am trying to
> figure out the best way to perform rollouts.  Depending on how I set
> the variables I get very different and even conflicting results.

[ rolling out 2 plays:
  Play 1)  24/20/16* 13/9(2)
  Play 2)  24/20/16* 8/4(2)
  after an opening 4-1 played 13/9 6/5 ]

[ results so far:

Play 1) JF7 evaluation              .486
        Level 6 (36x)               .516
        Level 6 (36x)               .451
        Level 6 (106x)              .439
        Level 5 truncated (7776x)   .484

Play 2) JF7 evaluation              .461
        Level 6 (36x)               .413
        Level 6 (36x)               .559
        Level 6 (106x)              .530
        Level 5 truncated (7776x)   .491
]

> I don't like the idea of truncated rollouts because they rely
> heavily on JF's evaluation of the position.  If it is incorrectly
> evaluating the position then the results are not worth much.  It doesn't
> seem to evaluate backgames well and the above position easily turns into
> one.

Well, it's true that truncated rollouts rely on JF's evaluations,
but most of the time, and for most positions, this isn't much of
a problem.  This is because the errors in JF's evaluation will
in large part cancel out -- sometimes the evaluation will be too
high and other times too low.  And JF evaluations are really pretty
good.  Better than human evaluations, anyway.  Some error may remain
if the game tends to develop into positions in which there is some
bias in JF's evaluation.  By itself, this usually isn't too much
of a problem, because most positions tend to branch out into a wide
variety of types of positions, and the positions which don't, and
for which JF's evaluations are off, are often positions that you
can't trust JF with anyway.  If you review the rollouts of Robertie's
_Advanced_Backgammon_, you can get a good feeling for the amount
of error that typically arises from using truncated rollouts.

For the position in question, there should be very little trouble
with using truncated JF rollouts.  JF understands opening checker
play very well, and the game is likely to evolve into a wide
variety of different kinds of positions, so there should be
relatively little bias due to truncation.  I disagree that this
position will "easily" become a backgame.  The first player should
generally be very much trying to avoid this scenario, and will usually
succeed.  Certainly, with JF at the helm, this will very rarely
become a backgame.

The main advantages of truncated rollouts are two:
1) they are faster, and
2) they have lower variance.  That is, they converge toward the
   "infinite rollout" equity with fewer trials, on average.

Item two just means that you need fewer trials to get your
answer, so the advantage of truncated rollouts comes down
to just one thing, which is that they are faster.
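
To make the link between variance and speed concrete: the standard error of a rollout average is the per-game standard deviation divided by the square root of the number of trials, so halving the per-game spread cuts the trials needed by a factor of four. The Python sketch below is only an illustration. The helper `trials_needed` is hypothetical, and the per-game standard deviations are assumed values, not JF measurements.

```python
import math

def trials_needed(per_game_sd, target_se):
    """Smallest number of trials whose average has standard error <= target_se.

    Uses SE = per_game_sd / sqrt(n), which gives n = (per_game_sd / target_se)**2.
    """
    return math.ceil((per_game_sd / target_se) ** 2)

# Assumed per-game standard deviations, chosen only for illustration.
full_sd = 1.0       # complete rollout: every game is played to the end
truncated_sd = 0.5  # truncated rollout: each game ends on an evaluation

target = 0.03125    # desired standard error of the average (1/32)

print(trials_needed(full_sd, target))       # 1024
print(trials_needed(truncated_sd, target))  # 256
```

Under these assumed numbers the truncated rollout reaches the same precision with a quarter of the trials, on top of each trial being shorter.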

The disadvantage of truncated rollouts is that sometimes they
are biased.  This is less of a problem in a checker play rollout
(which is also when speed is more of a concern), but very
important for cube rollouts.  But the more significant disadvantage
to truncated rollouts is that JF does not give you "live cube"
figures with truncated rollouts, which it does with non-truncated
rollouts.  This is obviously a problem when you are rolling out
a cube action problem, but also a factor in many checker play
problems (see, for example, Jeremy Bagai's excellent article in
the Jan-Feb Inside Backgammon, or the solution to Inside Backgammon
quiz problem #110).  For these reasons, I almost always do
complete rollouts, but truncated rollouts are not as suspect
as you think.

> I'm new to this rollout business and am not making much out of
> the above results.  I'm also starting to think that JF rollouts are
> way overrated.  I studied the JF rollouts of Robertie's Advanced
> Backgammon and I find Robertie's logic far more convincing than the
> rollouts in the vast majority of the problems.

Well, my guess is that you're overrating Robertie's logic.  The
fact is, most interesting backgammon problems cannot be tackled
by logic.  Over the board, we reason as best we can, but ultimately
we are just guessing based on our experience.  Robertie recognizes
this himself.  A few years back he sharply criticized a problem
solution by Kleinman (which was based on reasoning from general
principles), and backed up his criticism with (hand) rollouts.
Robertie wrote that backgammon was not "an exercise
in deductive logic" but rather, at least for correctly analyzing
positions, an exercise in empirical science.  Rollout data is
exactly what is needed to determine the correct play, most
of the time.

The fact is that many of Robertie's solutions are after-the-fact.
Long propositions were played, and Robertie learned the result
and saved the position.  In his book, he justifies the solution
based on logic or reasoning or breaking down the rolls or
emphasizing one very important feature of the position.  In doing
this, he is showing the reader how one might approach the problem
over the board, which is exactly what you want to know to play
better backgammon.  But the important thing to realize is that
the empirical data came first, and the reasoning to point you to
the correct play is derivative.  Kit Woolsey has also often
emphasized this point, by saying how he has learned a lot from
trying to figure out rollout results which at first seemed unintuitive.

Now, as to whether JF rollouts are overrated -- I guess it depends
on the person and the position.  JF rollouts are a tremendous source
of empirical data for a wide variety of positions.  But they do
have their limitations.  First of all, any rollout is subject to
statistical variation.  So when results come out very close, there
is very good reason to be skeptical about the results' significance.
JF gives the standard deviations of the rollouts it performs, so
this can be a guide for that.
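
As a rough sketch of how those standard deviations might be used, the Python function below compares the results for two plays. It assumes the figure reported for each play is the standard deviation of that play's average equity, and that the two rollouts are independent. Both are assumptions, and `is_significant` is a hypothetical helper, not a JF feature. The standard deviations in the usage lines are invented for illustration.

```python
import math

def is_significant(equity_a, sd_a, equity_b, sd_b, n_sd=2.0):
    """Return True if the equities differ by more than n_sd standard
    deviations of their difference, assuming independent rollouts."""
    # Variances of independent estimates add, so the standard deviation
    # of the difference is the root of the summed squares.
    sd_diff = math.sqrt(sd_a ** 2 + sd_b ** 2)
    return abs(equity_a - equity_b) > n_sd * sd_diff

# Equities .484 and .491 with an assumed standard deviation of .010 each.
print(is_significant(0.484, 0.010, 0.491, 0.010))  # False
# A gap of .10 with the same assumed standard deviations.
print(is_significant(0.40, 0.010, 0.50, 0.010))    # True
```

If both plays are rolled out with the same seed, the results are correlated rather than independent, so this check is likely to be on the cautious side.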

Secondly, any position can be misplayed.  Putting aside for the
moment major thematic errors, small mistakes can be made favoring
one side or the other, and these small mistakes should add a little
more doubt to the significance of close results, even in positions
that we believe JF handles well.

Now, turning to the question of thematic errors, it's well documented
that JF has a few of these.  Here are the ones that come to mind
right now:

- JF gets low results with outside primes -- the further outside,
  the more irrelevant the results.  JF doesn't completely understand
  how to walk a prime home against a single trapped checker.
- JF doesn't understand well how and when to try for a second checker
  after a bearoff hit.
- JF gets high results in many backgames.  However, I think this
  bias has been overemphasized.  In backgames nearing resolution,
  where the timing issue has been resolved, as is the case in many
  forward (e.g., 34 or 45) backgames, JF's results are not that
  far off.  In these cases, JF may give up a little due to having
  to walk its prime home after a hit, but probably not much.  In
  deeper backgames, JF gives up more because capturing a second checker
  may be a significant consideration.  Also, JF doesn't always
  understand when to split its rear checkers to generate more shots.
  In positions where the timing issue is not yet resolved, or where
  there is still significant forward equity, as in a two-way game,
  JF *may* give up significant equity because it often will avoid
  the backgame strategy that a human would choose.  I emphasize may
  because I think JF is often right in avoiding the backgame, and that
  human players are often wrong about this.  JF probably gives up
  the most in well-timed deep backgames where the leader is still
  a long way from the bearin.
- JF can get weird results in noncontact positions.  This problem
  has probably been reduced by the bearoff database in JF2.0, but
  JF still isn't the best tool for these kinds of positions.
- JF gets low results for many priming positions against one back,
  even when the prime is deep in the board.  This is especially
  true when slotting the back of the prime is important.  JF very
  often doesn't do this when it is correct.
- Wilcox Snellings thinks JF gets high results vs deep anchor
  games, especially vs ace point games.  I don't know whether this
  is true or not, but it's plausible.  Part of the equity of ace point
  games comes from capturing a second checker after a late hit.
- JF can get results that are off in what I call "runaround" positions.
  These are positions where one side is trying to navigate the last
  few checkers around the opposition.  An example is: side A has
  4 checkers each on the 1, 2, and 3 points, and 1 checker each on
  the 4, 17 and 18 points; side B has a closed board, and 1 checker each
  on the 18, 19, and 20 points.  JF doesn't count shots, so sometimes
  it makes significant checker play errors when rolling these positions
  out.
- JF gets low results in bearoff hit positions in which there is
  a lot of play.  For example,

  X O O . X . | | . . . . X . [2]
    O O       | |
    O O       | |
    O O       | |
              | |
        X   X | | X X X
  . . O X O X | | X X X . X X

  X's home board.  O has 5 off

  With O owning a 2-cube, X's equity is about 0.70.  JF gets
  .261 cubeless, .345 after doubling to 2 (3888 trials).
  Interestingly, humans tend to overrate the value of these
  positions.
- Many purely technical decisions are less amenable to rollouts,
  whether by JF or humans.  This is especially true if the technical
  decision tends to repeat itself.
- Because of the way JF uses the cube in live cube rollouts, sometimes
  its cube numbers are way off.  A common example is a position where
  the trailer has a busted board, one checker back at the edge of a
  five prime, and the leader has checkers back in the trailer's home
  board.  In this situation, the trailer may leap the prime and obtain
  a double-in (which JF doesn't recognize), only to obtain a huge
  cash one roll later.  In general, if the trailer has only one common
  recube variation, and this variation yields mostly weak doubles-in,
  JF's live cube algorithm will not give accurate results.
- Another common live cube error ends up with the cube owner doing
  *worse* owning the cube.  Apparently when this happens JF has
  erroneously played on for the gammon some of the time.

So yes, JF rollouts cannot be trusted implicitly.  However, for
most positions JF rollouts are the best source for equities, and
considered carefully, the best tool for improving your game.

An interesting corollary to the fact that JF misplays the above
situations is that JF plays other types of positions *better*
than a human of overall equivalent strength would.  This shows
up most prominently in the play of attacking positions, where
JF frequently gets results that are higher than humans get.

> I would appreciate some advice from those more experienced
> with rollouts as to how better utilize the program.  What parameters
> work best for the above rollout?

Here's my advice:
- Always do rollouts in multiples of 36 (unlike the 106-game rollout)
  and in multiples of 1296 if doing level 5 rollouts.
- If you have time and a fast enough computer, do complete rollouts.
  This way you avoid any bias and get the cube numbers as well.
- When doing checker play rollouts, set the seed the same for all
  the plays under consideration.
- Don't regard checker play results that are within 2 standard
  deviations as anything significant.  If you don't want to bother
  to look at the standard deviations, as a rule of thumb, consider
  differences of .10 significant for rollouts of 1296, .07 for
  2592, .06 for 3888.
- There are decreasing returns as you roll positions out more times.
  You will reduce the standard deviation, but if the equities are
  still close, the errors in checker play are probably more significant
  than the random error.  I usually don't roll plays out more than
  3888 times.
- When rolling out checker plays, go ahead and roll out all those
  plays that fit the themes of the position, even if you don't think
  they are candidate plays.  Occasionally one of these plays you
  didn't like will actually turn out to be best, and you'll learn
  something.  If you're short on time, do small short truncated
  rollouts first to identify the real candidates.
- Look at both the cube numbers and the cubeless numbers.
- If you really want to understand what's going on in a cube action
  situation, roll out several variations of the position, so that
  you can see how they affect the equity.  Use the same seed for
  all of these rollouts.
- DON'T just believe the rollout results as though they came
  from on high.  But try to understand the sorts of positions
  where the results are off, and why, so that you can know
  when you can trust JF and when you should be skeptical, and
  the probable direction of the error.
- If you suspect that the rollout is biased, you can look at how
  JF plays the first numbers, or set up a few important variations
  to see how it plays those.  You may find that with level 6 it
  does a better job, in which case use that.  If it still seems
  to be playing the position wrong on level 6, use the interactive
  rollout feature.  One approach would be to play it 36x with
  you playing one side, JF level 6 the other, and then another
  36x with you playing the other side and JF level 6 the first.
  If you're right that JF is screwing the position up (and you're
  not), you'll see it in the results.
- Be careful about interpreting the rollout results for a
  particular match score.  JF does all its rollouts based on
  choosing the best cubeless plays, with gammons and backgammons
  counting (and counting equally for both sides), so the results
  may not be valid in a match situation.  For many match scores,
  there is no satisfactory way to set the JF cashing parameter
  to give a reasonable match live cube rollout, so you are better
  off interpreting the cubeless numbers.
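
The three rule-of-thumb thresholds in the list above (.10, .07, .06) are consistent with scaling by one over the square root of the number of trials. That is an inference from the quoted figures, not a formula given in the list. The Python sketch below reproduces the thresholds under that assumption and also checks a trial count against the multiples-of-36 advice. Both helper names are hypothetical.

```python
import math

def rule_of_thumb_threshold(trials, base_trials=1296, base_threshold=0.10):
    """Scale the .10-at-1296 threshold by 1/sqrt(trials).

    Assumes the rule of thumb follows square-root scaling.
    """
    return base_threshold * math.sqrt(base_trials / trials)

def valid_trial_count(trials, level5=False):
    """Check the advice: multiples of 36, or of 1296 for level 5 rollouts."""
    return trials % (1296 if level5 else 36) == 0

for n in (1296, 2592, 3888):
    print(n, round(rule_of_thumb_threshold(n), 2))  # 0.1, 0.07, 0.06

print(valid_trial_count(106))                # False: 106 is not a multiple of 36
print(valid_trial_count(7776, level5=True))  # True: 7776 = 6 * 1296
```

The same function gives a threshold for any other trial count, for example about .14 for a 648-game rollout, again only under the square-root assumption.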

My experience is mostly with using JF level 5 rollouts, but it
may well be better to use JF level 6 by default.  It certainly
plays better on level 6, and with JF's variance reduction algorithm,
it's not a lot slower, effectively.

Hope this is of some use to you,
David Montgomery
monty on FIBS
 