Java: BG-Blitz--How strong?

From:   montygram
Date:   28 June 2005
Subject:   Re: BGBlitz 1.9.4 is online....

Would you know how BGBlitz compares to the free Jellyfish and GNU programs,
in terms of strength?


Frank Berger  writes:

I cite Jim Segrave (from r.g.b):
> It's somewhere in the same playing strength as gnubg or Snowie 4 --
> no-one's played enough matches between either of those and BGblitz
> to get an exact measure of the difference in strength, but unless
> you are an incredibly strong player, it won't make any difference.
> (I'm not associated with BGblitz, I'm a gnubg developer, but I
> bought a copy and have been thoroughly satisfied with all aspects
> of it).

The problem is that you have to play a huge number of games to
measure playing strength, and this takes more time than anybody has
been willing to spend so far. Here is what I believe:

BGBlitz's checkerplay is better than Jellyfish's and on par with
gnubg's (no idea about Snowie; I have had no opportunity to test it).
The cubeplay may be a little better in gnubg. With this release I added
cubeful equities for matchplay (plugging the last weakness). The
function that tries to estimate the cube liveness might need some
tuning, although it seems to be robust so far. At least it correctly
solves all the bad cube decisions from the Graz matches against gnubg.

I recently ran 100 13-point matches between Jellyfish and BGBlitz,
both on their highest settings (final score 50-50; it took about a
week with Dueller), to test the new cube algorithm, and fed the first
40 matches to gnubg for analysis (gnubg 2-ply). The result was that
BGBlitz has a slightly higher rating than JF (about 10 points or so).
The absolute value (around 2030) is only a rough indicator. I checked
some moves and cube decisions that gnubg had flagged as errors or even
severe errors, and found that on 3-ply or 4-ply gnubg agrees with
BGBlitz's choices in about a third of the positions, and that the
"error" shrinks at higher plies in a large percentage of the other
positions, so the "real" rating is most probably higher by a decent
amount.
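
To put these numbers in perspective, here is a rough sketch of why 100
matches cannot separate two bots that are about 10 rating points apart.
It assumes that the rating difference can be read through the FIBS
rating formula (gnubg's analysis rating is an estimate derived from
error rates, so this is only an approximation) and that matches are
independent. The function names are made up for this illustration.

```python
import math

def fibs_win_probability(rating_diff, match_length):
    """Chance that the higher-rated player wins a match of the given
    length, according to the FIBS rating formula (an assumption here)."""
    exponent = -rating_diff * math.sqrt(match_length) / 2000.0
    return 1.0 / (1.0 + 10.0 ** exponent)

def win_rate_std_error(p, n_matches):
    """Standard error of the observed win rate over n independent matches."""
    return math.sqrt(p * (1.0 - p) / n_matches)

def matches_needed(p, z=1.96):
    """Matches needed until the edge (p - 0.5) exceeds z standard errors."""
    edge = p - 0.5
    return math.ceil(z ** 2 * p * (1.0 - p) / edge ** 2)

# 10 rating points in 13-point matches: only a tiny edge per match.
p = fibs_win_probability(10, 13)
# Uncertainty of a win rate measured over 100 matches between equal bots.
se = win_rate_std_error(0.5, 100)

print(f"win probability for +10 rating points: {p:.4f}")
print(f"standard error after 100 matches:      {se:.4f}")
print(f"matches needed to see the edge:        {matches_needed(p)}")
```

Under these assumptions the edge per match is about one percentage
point, while the standard error after 100 matches is five percentage
points, so a 50-50 result says very little either way. Several
thousand matches would be needed to show such a small difference from
results alone, which is why analysing the error rate is the more
practical route.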

Further, I don't believe that there is an "X is stronger than Y"
relation in a mathematical sense; rather, the best bot varies according
to the position. E.g. I believe that BGBlitz is more robust than gnubg
in unusual positions (see the terrible backgame of GNU that was posted
here some months ago; another indicator is that BGBlitz has a far
smaller odd-even effect), but there are several types of positions
where I believe GNU's understanding is better (e.g. some types of
running games). I also have positions (but only very few, mostly from
books) where Snowie blunders. In some of them BGBlitz chooses the right
move even on 1-ply; in others BGBlitz was even further off than Snowie,
so here again there is no strict X > Y relation.

Having plugged the ugly cubing weakness, I will put my focus on the
checkerplay in the next few months (in fact the current net is the same
as in 1.0, except that it got a bigger hidden layer with v1.7). I have
some ideas for interesting inputs and some ideas for improved learning
methods, and I'm very confident that I can squeeze out another 1-2%.

So although I couldn't give you an easy answer, I hope that this post
helps you a little.

