Forum Archive : Ratings

Empirical analysis

From:   Gary Wong
Address:   gary@cs.arizona.edu
Date:   20 October 1998
Subject:   Re: FIBS formula question/comment
Forum:   rec.games.backgammon
Google:   wtbtn79n90.fsf@brigantine.CS.Arizona.EDU

hankyoungerman@home.com (Hank Youngerman) writes:
> I would like to see some empirical analysis of the relative likelihood
> of winning and losing - but other than that - the formula seems pretty
> sensible to me.

Careful what you wish for, you just might get it :-)

I performed an experiment testing these three hypotheses:

 * That the basic Elo system could accurately describe the distribution
   of backgammon game results, if matches were all of the same length.

 * That the FIBS implementation of the Elo system accurately describes
   the distribution of backgammon game results, for matches of all
   lengths.

 * That the FIBS implementation of the Elo system systematically
   overestimates the underdog's chances in shorter than average matches,
   and overestimates the favourite's chances in longer than average
   matches.

The details are written up at:


but the bottom line is that my data (several thousand one-pointers that
Abbott played against opponents rated from 1280 to 1880) showed no evidence
refuting _any_ of the three hypotheses above.  I was going to go into
more detail (graphs of observed winning probabilities vs. FIBS-predicted
winning probabilities vs. best-fit Elo predicted winning probabilities
specifically fitted to the match length, etc.) but once I got as far as
what's on that web page, I'd pretty much satisfied my own curiosity so I
think it's pointless going any further.

(The one thing I would be interested in is performing the same kind of
analysis on longer length matches, if anybody has data available.
Unfortunately the chi-squared test I used needs LOTS of data; preferably
over 1000 matches of identical length.  If they all involve the same
player, that's even better; if that player is a bot and so we know their
ability didn't change while the data were being collected, that's better
still.)
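
A chi-squared comparison of this kind can be sketched roughly as follows.
This is a minimal illustration in Python, not the test as it was actually
run: the function name, the grouping of matches into bins by predicted win
probability, and the counts are all assumptions made up for the example.

```python
def chi_squared_statistic(bins):
    """Pearson chi-squared statistic for observed vs. predicted wins.

    bins is a list of (predicted_win_prob, matches, wins) tuples, one per
    group of matches that share (roughly) the same predicted probability.
    This layout is hypothetical, chosen only for illustration.
    """
    statistic = 0.0
    for predicted_win_prob, matches, wins in bins:
        expected_wins = matches * predicted_win_prob
        expected_losses = matches * (1.0 - predicted_win_prob)
        losses = matches - wins
        # Each bin contributes one term for wins and one for losses.
        statistic += (wins - expected_wins) ** 2 / expected_wins
        statistic += (losses - expected_losses) ** 2 / expected_losses
    return statistic


# Made-up counts for illustration only; not data from the experiment.
example_bins = [(0.52, 1000, 531), (0.55, 1000, 540), (0.60, 1000, 612)]
print(f"chi-squared = {chi_squared_statistic(example_bins):.2f}")
```

If the predictions are fixed in advance and the bins are independent, the
statistic is compared against a chi-squared distribution with one degree of
freedom per bin.  The approximation is poor when expected counts are small,
which is one reason the test wants so many matches.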

My current opinion is that the Elo system can be a very good predictor
of backgammon game distributions; that the FIBS implementation (with
the ratings difference scaled by the square root of the match length)
is flawed but adequate; and that FIBS systematically overestimates
the underdog's chances in short matches and overestimates the favourite's
chances in long matches.  As I said, I've satisfied my own curiosity and
am now going to shut my mouth on this topic (bet you thought you'd never
hear me say that :-)
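
For reference, the square-root scaling can be written out as a short Python
sketch.  The constant 2000 and the form of the expression come from the
published FIBS rating formula rather than from this post, and the function
name is made up for illustration.

```python
import math


def fibs_upset_probability(rating_diff, match_length):
    """Chance that the lower-rated player wins, per the FIBS formula.

    rating_diff is the absolute rating difference and match_length is the
    match length in points.  The constant 2000 is taken from the published
    FIBS formula, not from the post above.
    """
    scaled_diff = abs(rating_diff) * math.sqrt(match_length) / 2000.0
    return 1.0 / (10.0 ** scaled_diff + 1.0)


# The same 200-point rating gap at several match lengths.
for length in (1, 5, 25):
    p = fibs_upset_probability(200, length)
    print(f"{length:2d}-point match: underdog wins with probability {p:.3f}")
```

With a 200-point gap the formula gives the underdog about 0.443 in a
one-pointer and about 0.240 in a 25-point match, so the predicted edge for
the favourite grows with match length only through the square-root term.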

        Gary Wong, Department of Computer Science, University of Arizona
            gary@cs.arizona.edu     http://www.cs.arizona.edu/~gary/

Constructing a ratings system  (Matti Rinta-Nikkola, Dec 1998) 
Converting to points-per-game  (David Montgomery, Aug 1998)  [Recommended reading]
Cube error rates  (Joe Russell+, July 2009)  [Long message]
Different length matches  (Jim Williams+, Oct 1998) 
Different length matches  (Tom Keith, May 1998)  [Recommended reading]
ELO system  (seeker, Nov 1995) 
Effect of droppers on ratings  (Gary Wong+, Feb 1998) 
Empirical analysis  (Gary Wong, Oct 1998) 
Error rates  (David Levy, July 2009) 
Experience required for accurate rating  (Jon Brown+, Nov 2002) 
FIBS rating distribution  (Gary Wong, Nov 2000) 
FIBS rating formula  (Patti Beadles, Dec 2003) 
FIBS vs. GamesGrid ratings  (Raccoon+, Mar 2006)  [GammOnLine forum]
Fastest way to improve your rating  (Backgammon Man+, May 2004) 
Field size and ratings spread  (Daniel Murphy+, June 2000)  [Long message]
Improving the rating system  (Matti Rinta-Nikkola, Nov 2000)  [Long message]
KG rating list  (Daniel Murphy, Feb 2006)  [GammOnLine forum]
KG rating list  (Tapio Palmroth, Oct 2002) 
MSN Zone ratings flaw  (Hank Youngerman, May 2004) 
No limit to ratings  (David desJardins+, Dec 1998) 
On different sites  (Bob Newell+, Apr 2004) 
Opponent's strength  (William Hill+, Apr 1998) 
Possible adjustments  (Christopher Yep+, Oct 1998) 
Rating versus error rate  (Douglas Zare, July 2006)  [GammOnLine forum]
Ratings and rankings  (Chuck Bower, Dec 1997)  [Long message]
Ratings and rankings  (Jim Wallace, Nov 1997) 
Ratings on Gamesgrid  (Gregg Cattanach, Dec 2001) 
Ratings variation  (Kevin Bastian+, Feb 1999) 
Ratings variation  (FLMaster39+, Aug 1997) 
Ratings variation  (Ed Rybak+, Sept 1994) 
Strange behavior with large rating difference  (Ron Karr, May 1996) 
Table of ratings changes  (Patti Beadles, Aug 1994) 
Table of win rates  (William C. Bitting, Aug 1995) 
Unbounded rating theorem  (David desJardins+, Dec 1998) 
What are rating points?  (Lou Poppler, Apr 1995) 
Why high ratings for one-point matches?  (David Montgomery, Sept 1995) 
