Maverick writes:
> There have been some postings about whether:
> A:Jellyfish plays backgames well
> B:Whether Jelly plays them better than Snowie
>
> There seems to be a general (I think erroneous) opinion that
> JF doesn't play backgames well.
> Well I'd like to challenge those guys who make this claim to come up
> with some evidence. I play an even game with JF Level 7 (Tested over
> 10 sets of 100 games) and although backgames don't occur frequently,
> when they do I find JF has no problem defending against them at all.
> It certainly knows how to bust my timing and certainly knows how to
> recirculate checkers and capture additional ones as I have found out
> to my cost.
There is a big difference in the strategies of defending against a
backgame, and playing a backgame. In general, programs (JF, SW, EXBG)
have been criticized much more for their play of the backgame side than
for how they play in defending against a backgame. When someone says that
a program doesn't play backgames well, they are almost certainly
talking about playing the backgame side, as opposed to the defender
side.
Backgames come in a multitude of varieties, and the appropriate
strategies can differ greatly. I think one of the most poorly understood
classes of positions consists of incipient/potential backgames in which
the trailer has some small chance of going forward, but generally has
a poor game, and in which the timing issue is undecided. I think these
positions are probably played poorly by everyone, top humans and bots
included. The bots mostly play these positions by adamantly refusing
to play a backgame strategy. Assuming they play backgames poorly, this
makes sense. Humans are much more likely to go into 'backgame mode'
too readily, giving up substantial equity. Until we have stronger
programs, I don't think we will be able to accurately resolve many
of the issues arising in these sorts of positions.
If we look at positions where the timing issue is resolved, where the
defender is bearing in or bearing off, the game becomes much clearer and
with the existing tools we can learn a lot about the right decisions. Here
the issues are primarily: How likely is the leader to win a gammon, when
hit and when not hit? How likely is the leader to win a backgammon? How
often will the trailer hit? How many checkers will the leader have off
when hit? How strong will the backgame player's offense be after a hit?
Can the backgame player capture a second checker?
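One way to see how these questions fit together is as a weighted average over whether the trailer hits. The sketch below is only an illustration: the function names are made up for this example, and every probability is a hypothetical number chosen to show the arithmetic, not a rollout result.

```python
def cubeless_equity(win, win_gammon, win_bg, lose_gammon, lose_bg):
    """Cubeless equity from the leader's point of view.

    win         -- probability the leader wins at all
    win_gammon  -- probability the leader wins a gammon or backgammon
    win_bg      -- probability the leader wins a backgammon
    lose_gammon -- probability the leader loses a gammon or backgammon
    lose_bg     -- probability the leader loses a backgammon
    """
    lose = 1.0 - win
    return (win - lose) + (win_gammon - lose_gammon) + (win_bg - lose_bg)


def leader_equity(p_hit, equity_if_hit, equity_if_not_hit):
    """Average the two branches by how often the trailer hits."""
    return p_hit * equity_if_hit + (1.0 - p_hit) * equity_if_not_hit


# Hypothetical numbers, chosen only to show the arithmetic.
not_hit = cubeless_equity(win=1.0, win_gammon=0.5, win_bg=0.1,
                          lose_gammon=0.0, lose_bg=0.0)
hit = cubeless_equity(win=0.4, win_gammon=0.05, win_bg=0.0,
                      lose_gammon=0.1, lose_bg=0.0)
total = leader_equity(p_hit=0.25, equity_if_hit=hit, equity_if_not_hit=not_hit)
```

The equity after a hit depends heavily on how many checkers the leader already has off and on whether a second checker can be captured, which is why those two questions matter so much.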
There is certainly one area of play where the bots are indisputably
worse than top humans right now. This is when a side gets hit bearing
off, has many checkers borne off (say, 10-13), and the hitting
side has a strong blocking structure and can try for a second checker.
This is a fairly obscure patch of the backgammon universe, but it is
a significant part of the equity a backgame player has. To the degree
that this is a significant variation, the bots will do worse in backgames.
Mostly this comes up in well-timed deep backgames.
In two other areas the bots seem to be a little worse than people in
backgames. First, in arranging their checkers to get a hit (and avoid
the gammon and backgammon). The backgame player often must break one
or more anchors to get the best chance for a hit (or an early, effective
hit). Second, in bringing home the win after a hit. This often involves
slotting parts of a prime or rolling a prime home. The bots aren't too
bad at these things relative to people (in typical positions), but they
do seem a little worse.
> I don't own a copy of Snowie but if you do, how about importing some
> backgame matches Jelly has played and seeing if Snowie can indeed
> spot some blunders.
It is very hard to find games where JF opts to play a deep backgame.
JF has learned not to play them.
> Where some disagreements may be occurring on JF's playing of backgames
> are probably resulting from comparing rollouts to actual games.
> As the rollouts are limited to either L5 or L6 the data may not reflect
> JF's true strength at playing backgames as it plays a MUCH stronger
> game at L7 which it may "need" in order to play backgames properly.
Although JF probably plays backgames better on level 7 than on level 6,
I don't think that is the main problem. One way of looking at why
neural nets so easily get so good at backgammon is to realize that for
most positions, most of the time, backgammon is easy. The things that
are good are good, and the things that are bad are bad. Opponent on
the roof, good. Me on roof, bad. Opp leaves shots, good. Me leaving
shots, bad. My points are good, the opponent's are bad. Being ahead in
the race is good. Having a strong board is good. Having a strong
blockade is good. Having many checkers back is bad. Having checkers
flexibly placed (say, 3 to a point) is good -- having them stacked up
(5+ to a point) is bad.
These things are almost always true. All you have to do is to figure
out the proper weighting for the things -- how important is this good
thing vs. that good thing? And the nets are very good at learning from
experience how to properly weight these things, so they never (in
"normal" positions) weight these things drastically wrong (unlike
people, who do).
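The "fixed sign, learned weight" idea can be pictured as a weighted sum of features. This is a deliberately simplified sketch: real neural nets are nonlinear, and the feature names and weights below are hypothetical, not taken from JF, Snowie, or any other program.

```python
# Hypothetical feature weights. Signs follow the rules of thumb above:
# positive means "good for me", negative means "bad for me".
WEIGHTS = {
    "opp_checkers_on_bar": 0.30,
    "my_checkers_on_bar": -0.30,
    "my_home_points": 0.08,
    "pip_lead": 0.01,
    "my_checkers_back": -0.05,
    "my_stacked_points": -0.04,
}


def evaluate(features, weights=WEIGHTS):
    """Weighted sum of feature values; a missing feature counts as zero."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


# Two made-up positions described only by their feature values.
ahead = {"opp_checkers_on_bar": 1, "my_home_points": 4, "pip_lead": 20}
behind = {"my_checkers_on_bar": 1, "my_checkers_back": 4, "pip_lead": -20}

score_ahead = evaluate(ahead)
score_behind = evaluate(behind)
```

Training only has to tune the magnitudes. As long as the signs are right for the position at hand, a modest error in a weight costs little.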
The programs have had the most trouble in positions where the things
that are normally good aren't good any more, or the things that are
normally bad are now good. For
example, say you're trying to walk a prime home from the opponent's
outfield 18-13. Normally, the best point on the board is the 6 point.
But here it would be a huge mistake to make the 6 point (say, with
44, 55, or 66) -- the most important point right now is the 12 point,
which is a point usually worth little.
Usually, closing out your opponent is good, especially if you're not
primed. But this is often not the case if your opponent has 10 checkers off
and you might be able to capture a second checker. In fact, it can be
right to take a point in your prime or strong board and turn it into two
blots in order to prevent your opponent from safetying a second checker!
This kind of play is rarely right normally.
Similarly, in backgames, it may not be bad at all to have many checkers
sent back, if you have timing. Being behind in the race is part of the
strength of the position. Having checkers on the roof can be helpful.
Having a strong inner board is almost worthless if it's a long time before
you can hit effectively. So all the things that a net has learned are
good are not necessarily good at all here.
This problem is remarkably minor in backgammon, which is one reason that it
has been easy for programs to learn good evaluation functions. In chess,
you can't say that having a certain piece on a certain square is good or
bad in itself -- at all! In backgammon, 99% of the time, owning the six
point is a good thing, until you are legally obligated to abandon it.
Thus I don't think adding another ply of lookahead is as big a deal.
If the overall understanding of the priorities is wrong, then weighting
your assumed priorities more accurately won't help much.
There are workarounds for these problems, of course. JF 3.0 has some
added functionality that helps it roll home distant primes. Snowie (beta)
doesn't have this yet. JF 3 plays backgames much more like people than
earlier versions, and I would guess that it has a different evaluation
function being used in many backgame situations. As more and better
workarounds are added, the bots will eventually play even these "weird"
positions much better.
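A workaround of the "different evaluation function" kind might look roughly like the sketch below. It is a guess at the general shape only, not a description of how JF 3 works. The classifier, the feature names, and the weights are all hypothetical.

```python
# Hypothetical weight tables. In the backgame table the usual signs for
# "checkers back" and "race lead" are flipped, and a strong home board
# counts for very little.
NORMAL_WEIGHTS = {"checkers_back": -0.10, "pip_lead": 0.02, "home_points": 0.08}
BACKGAME_WEIGHTS = {"checkers_back": 0.03, "pip_lead": -0.01, "home_points": 0.01}


def weighted_sum(features, weights):
    """Sum weight * value over the features named in the weight table."""
    return sum(w * features[name] for name, w in weights.items())


def is_backgame(features):
    """Crude, made-up test: two anchors held and far behind in the race."""
    return features["anchors_in_opp_home"] >= 2 and features["pip_lead"] <= -60


def evaluate(features):
    """Pick the weight table by position class, then score the position."""
    weights = BACKGAME_WEIGHTS if is_backgame(features) else NORMAL_WEIGHTS
    return weighted_sum(features, weights)


deep = {"anchors_in_opp_home": 2, "pip_lead": -90,
        "checkers_back": 6, "home_points": 2}
as_backgame = evaluate(deep)
as_normal = weighted_sum(deep, NORMAL_WEIGHTS)
```

The hard part in practice is the classifier: the incipient backgames where timing is undecided are exactly the positions where it is unclear which table should apply.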
> There's already a lot of hullabaloo about what Snowie will cost and
> whether it will actually play a better allround game than Jelly.
> My guess is it will play some positions better and some worse.
True.
> I'm not familiar with neural net technology but if you are, could you
> explain to me whether if a neural net's inputs are being tweaked in one
> area to improve its game whether it affects another part of its game
> even if the sum of the two changes still makes the net overall better
> ?
Yes, if you are using the same evaluation function everywhere, this will
generally be true. For example, in some experiments people tried training
not just from the opening position, but also from positions with more
checkers back, more potential backgames. In one case they trained from the
Nackgammon position. The programs then played better in these sorts of
positions, but the overall play decreased. Actually, this isn't the same
as changing the inputs, as you mention, but it is similar. If you change
the inputs, and then train one evaluation function to be used everywhere,
the evaluations are likely to be changed (at least somewhat) in lots of
areas of the game, not just the one you intended to change.
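
The trade-off can be shown with a toy model: one shared weight fitted by least squares to two kinds of positions that want opposite signs. Everything here is an assumption made for illustration (a single feature, linear targets, made-up data), and it says nothing about how any real program was trained.

```python
def fit_single_weight(samples):
    """Least-squares fit of y ~ w * x with one shared weight w."""
    num = sum(x * y for x, y in samples)
    den = sum(x * x for x, _ in samples)
    return num / den


def mean_squared_error(w, samples):
    """Average squared error of the prediction w * x over the samples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


# x = checkers back, y = made-up "true" value of the position.
normal = [(x, -1.0 * x) for x in (1, 2, 3)]    # more checkers back is bad
backgame = [(x, 0.5 * x) for x in (4, 5, 6)]   # with timing, it is not

w_normal_only = fit_single_weight(normal)
w_mixed = fit_single_weight(normal + backgame)

# Training on the mix helps the backgame positions...
backgame_before = mean_squared_error(w_normal_only, backgame)
backgame_after = mean_squared_error(w_mixed, backgame)
# ...but the same shared weight now fits the normal positions worse.
normal_before = mean_squared_error(w_normal_only, normal)
normal_after = mean_squared_error(w_mixed, normal)
```

With a single evaluation function there is nowhere for the backgame knowledge to live except in weights that the normal positions also use.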
David Montgomery
monty@cs.umd.edu
monty on FIBS
David Montgomery writes:
I wrote:
> > There is certainly one area of play where the bots are indisputably
> > worse than top humans right now. This is when a side gets hit bearing
> > off, has many checkers borne off (say, 10-13), and the hitting
> > side has a strong blocking structure and can try for a second checker.
> > This is a fairly obscure patch of the backgammon universe, but it is
> > a significant part of the equity a backgame player has. To the degree
> > that this is a significant variation, the bots will do worse in
> > backgames. Mostly this comes up in well-timed deep backgames.
maverick wrote:
> Sorry but I have to disagree. I played a game only the other day with
> 12 checkers off and JF had only the 2, 3, 4, 6 and 8 points made. I
> entered on the ace but was unable to move the second part of the roll. It
> subsequently slotted both the 5 and 7 points with its next play, I was
> forced to hit and eventually it made a prime followed by some
> recirculation and a closeout. It played this game as far as I could see
> perfectly on L7.
> Could you give me some evidence where JF doesn't play this position
> correctly?
If you post the exact position I will try to do so -- assuming that it
does play incorrectly -- I don't know since I haven't seen the position.
(If both of your unhit checkers were on your ace point, so that capturing
a second checker is virtually impossible, then JF plays the position
quite well.)
Later in this post I'll show you a position that I play better than the
bots (although I figure that I'm actually botching it). First, let me
explain one way (not the only way) to figure out that the computer
players are playing worse, in some types of positions.
You are certainly right that it can be difficult to figure out whether
the expert or the program is playing the backgame better, since
there might be something else going on, like the expert playing the other
side *worse*. However, in some positions it seems clear that one
side has the more difficult checker play. To take an extreme example,
let's say that one side has only 1 checker left on the board, and the
other side is trying to contain this checker and then win. Without a
doubt all the skill lies with the 15 checker side. If program A can
win more with the 15 checker side than program B, we can conclude that
program A plays this position better.
Here's an example where both sides have 15 checkers on the board,
but clearly all the skill is for one side:
|===========================================| 30
| O O ... X X | | X X X O ... |
| O O ... X X | | X X X ... |
| O O ... X X | | X X X ... |
| O O . . | | . . . |
| O O . . | | . . . |
| O O | | |
| O | | |
| O | | | X on roll
| | | |
| | | |
| . . . | | . . . |
| . . . | | . . . |
| ... ... ... | | ... ... ... |
| ... ... ... | | ... ... ... | [2]
| ... ... ... | | ... ... ... |
|===========================================| 270
With good human play, X can cash. With JF v1 and v2, O has a
monster beaver, followed by a cash regardless of X's roll (hmmm...
or maybe O is supposed to play on -- I don't remember). With JF
v3 handling the checker play (but not the cube) then X is close
to a double, but the take is easy. Could JF v3 really play O that
much better than people? And v2 so much more so? No. JF screws
this up. People play it better.
In developing backgames/potential backgames, there is great potential
for skillful play on both sides, and so it can be very hard to figure
out who is getting it right and who is getting it wrong.
However, once the backgame side is committed to bearing in (recirculation
is over), perhaps even bearing off, then the timing issues have been
resolved and most of the opportunity for skillful play lies with the
backgame side. The player bearing in will have some decisions on
how to arrange spares, and properly balancing long and short term safety
vs. ripping checkers. These are important decisions, but on the other
hand, many of the backgame defender's plays will be forced. The backgame
player will also have some choices in arranging checkers to get a hit,
though fewer, but these plays can be quite significant and the programs
often get them wrong (trivial example: JF sometimes doesn't vacate 24
when opponent has only 2 point stack). After a hit, much more skill
is required of the backgame player than the defender. The hitter must
contain the hit checker, perhaps slot and build a prime, walk the prime
home, perhaps try for a second checker, and then bear off in a position
where speed may be as important as safety. Here is an example:
|===========================================| 180
| ... ... O O O | | O ... ... |
| ... ... O O O | | O ... ... | [2]
| ... ... ... | | ... ... ... |
| . . . | | . . . |
| . . . | | . . . |
| | | | X on Roll
| | | |
| . . . | | . . . |
| . X X . | | . . . |
| ... X X ... | | ... ... ... |
| O O X X X | | ... ... ... |
| O O X X X | | ... O O O |
|===========================================| 40 X off: 5
In these kinds of positions, the majority of the skill is on the side
of the backgame player, and so if program A gets a better result than
program B for the backgame side, I believe program A is playing it better.
For example, JF v3 gets a lower result for X here than JF v2. Therefore
I believe that v3 plays O better.
Within reason, of course. If the results aren't that far apart, we might
want to consider whether program B is playing the defender better, and so
forth.
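Whether two results are "far apart" can be checked against the sampling error of the rollouts. The sketch below assumes each rollout is available as a list of independent per-game results in points. The function names and the two-standard-error threshold are choices made for this example, and variance-reduced rollouts would need a different error estimate.

```python
import math
import statistics


def mean_and_stderr(results):
    """Mean result per game and the standard error of that mean."""
    mean = statistics.mean(results)
    stderr = statistics.stdev(results) / math.sqrt(len(results))
    return mean, stderr


def clearly_different(results_a, results_b, z=2.0):
    """True if the means differ by more than z combined standard errors."""
    mean_a, se_a = mean_and_stderr(results_a)
    mean_b, se_b = mean_and_stderr(results_b)
    diff = mean_a - mean_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(diff) > z * se_diff, diff, se_diff


# Made-up per-game results for the backgame side under two programs.
program_a = [1] * 8 + [-1] * 2
program_b = [-1] * 8 + [1] * 2

different, diff, se_diff = clearly_different(program_a, program_b)
same, _, _ = clearly_different(program_a, program_a)
```

A difference inside the noise says nothing either way. A large one still has to be weighed against the possibility that one program is simply defending worse.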
Finally, there are positions with extensive skill required of both
sides, but which nonetheless seem to require disproportionate skill of
one side or the other. Here's an example:
|===========================================| 253
| ... ... O O | | ... ... ... |
| ... ... O O | | ... ... ... |
| ... ... ... | | ... ... ... |
| . . . | | . . . |
| . . . | | . . . |
| | | | [1]
| X | | |
| . X X | | . . . | X on Roll
| . X O X | | . . . |
| ... X O X | | ... ... ... |
| O O O X O X | | X X ... ... |
| O O O X O X | | X X ... O |
|===========================================| 86
I have played this a lot as a prop, and I think it's not too hard to win
against a weaker player from both sides, because while the backgame player
is a favorite with good play, he is a dog with average play. The defender
here does have some tough decisions on when to hit inside to try to
start a 3rd point, risking the hit of a second checker, but based on
my experience playing it, I'm sure it's harder to play O accurately.
JF v3 gets a better result for O than does JF v2, so I believe that
JF v3 plays it better than JF v2.
Okay, here is a position of the type I meant in the first paragraph:
+24-23-22-21-20-19-+---+18-17-16-15-14-13-+
13| O | | X |
O| | | |
O| | | |
O| | | |
O| | | |
O| | O | | O on Roll
| | | |
| | | |
| X | | |
| X | | |
| X X X | | X |
| X X X X | | X X X X |
+-1--2--3--4--5--6-+---+-7--8--9-10-11-12-+
This position is a follow-up to one that was posted here in April.
The original post was "stay on the 1pt against 3 on 2pt without
a prime", by Harald Retter. Here X stayed, playing 66 18/12 14/8
13/7 9/3, O rolled 61 playing 2/off 2/1*, and X hit with a 62, played
bar/23*/17, producing the position above.
I wasn't sure what the cube action was here, but I knew that I would
take over the board. Kit Woolsey wrote regarding this variation:
KW> 2) O rolls something-ace, and X hits with a two. O probably has
KW> a play-on for a bit, but it could get treacherous fast.
I was very surprised by this comment, Kit suggesting a play-on when I
was taking, so I set up the above continuation. (Note that 62 is probably
X's weakest hitting number.) I rolled it out, manually (using JF's
interactive rollout feature) and with JF v3 L6.
I don't have my results handy, but I know that I got a *much* lower
result for O than did JF. Since almost all of the skill here is for
X, I think that is proof that I play this position better than JF.
JF doesn't understand how important it is for X to capture the second
checker. Arranging ways to make this happen often takes precedence
over making the best play to contain the first checker.
I don't think I played the position particularly well, btw. My
recollection is that O should not double, and that X does not quite
have a beaver in the above position. I believe if I had the time
and motivation to study this position enough (say, if someone offered
me a long proposition contract) then I could play X well enough to
make it a beaver. With JF playing X, O should probably double, although
X's take is easy.
David Montgomery
monty@cs.umd.edu
monty on FIBS