hankyoungerman@home.com (Hank Youngerman) writes:
> I would like to see some empirical analysis of the relative likelihood
> of winning and losing - but other than that - the formula seems pretty
> sensible to me.
Careful what you wish for, you just might get it :-)
I performed an experiment testing these three hypotheses:
* That the basic Elo system could accurately describe the distribution
of backgammon game results, if matches were all of the same length.
* That the FIBS implementation of the Elo system accurately describes
the distribution of backgammon game results, for matches of all
lengths.
* That the FIBS implementation of the Elo system systematically
overestimates the underdog's chances in shorter than average matches,
and overestimates the favourite's chances in longer than average
matches.
The details are written up at:
http://www.cs.arizona.edu/~gary/backgammon/elo.html
but the bottom line is that my data (several thousand one-pointers that
Abbott played against opponents rated from 1280 to 1880) showed no evidence
refuting _any_ of the three hypotheses above. I was going to go into
more detail (graphs of observed winning probabilities vs. FIBS-predicted
winning probabilities vs. best-fit Elo predicted winning probabilities
specifically fitted to the match length, etc.) but once I got as far as
what's on that web page, I'd pretty much satisfied my own curiosity so I
think it's pointless going any further.
(The one thing I would be interested in is performing the same kind of
analysis on longer matches, if anybody has data available.
Unfortunately the chi-squared test I used needs LOTS of data; preferably
over 1000 matches of identical length. If they all involve the same
player, that's even better; if that player is a bot and so we know their
ability didn't change while the data were being collected, that's better
still.)
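
For readers who want to try this on their own data, here is a rough sketch
of one way such a chi-squared comparison could be set up. It is not
necessarily the procedure used on the web page above: the binning by
rating difference, the function names and the input layout are all
assumptions made for illustration, and the prediction is the commonly
quoted FIBS formula.

```python
import math

def predicted_win_prob(rating_diff, match_length=1):
    """Commonly quoted FIBS prediction that a player wins a match.

    rating_diff is (player rating - opponent rating); it is negative
    when the player is the underdog.
    """
    exponent = -rating_diff * math.sqrt(match_length) / 2000.0
    return 1.0 / (10.0 ** exponent + 1.0)

def chi_squared_statistic(bins, match_length=1):
    """Pearson chi-squared statistic for observed vs. predicted results.

    bins is a list of (mean_rating_diff, matches_played, matches_won)
    tuples. This layout is hypothetical, chosen only for the sketch.
    """
    stat = 0.0
    for rating_diff, played, won in bins:
        p = predicted_win_prob(rating_diff, match_length)
        expected_wins = played * p
        expected_losses = played * (1.0 - p)
        lost = played - won
        stat += (won - expected_wins) ** 2 / expected_wins
        stat += (lost - expected_losses) ** 2 / expected_losses
    return stat
```

With no parameters fitted to the data, the statistic would be compared
against a chi-squared distribution with one degree of freedom per bin
(for instance via scipy.stats.chi2.sf, if SciPy is available). Each bin
needs a reasonable number of expected wins and losses, which is why so
many matches of identical length are required.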
My current opinion is that the Elo system can be a very good predictor
of backgammon game distributions; that the FIBS implementation (with
the ratings difference scaled by the square root of the match length)
is flawed but adequate; and that FIBS systematically overestimates
the underdog's chances in short matches and overestimates the favourite's
chances in long matches. As I said, I've satisfied my own curiosity and
am now going to shut my mouth on this topic (bet you thought you'd never
hear me say that :-)
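
For reference, the square-root scaling mentioned above can be written out
as a small sketch. The FIBS formula is given here as it is commonly
quoted (underdog's winning chance 1/(10^(D*sqrt(N)/2000)+1) for rating
difference D and match length N); the function names and the free
"scale" parameter of the plain Elo curve are assumptions for
illustration, not anything taken from FIBS itself.

```python
import math

def elo_underdog_win_prob(rating_diff, scale):
    """Plain Elo-style curve: chance that the lower-rated player wins.

    scale is the divisor in the exponent; fitting it separately for each
    match length is one way to get a "best-fit Elo" curve.
    """
    return 1.0 / (10.0 ** (abs(rating_diff) / scale) + 1.0)

def fibs_underdog_win_prob(rating_diff, match_length):
    """FIBS version: the rating difference is multiplied by sqrt(N)."""
    scaled_diff = abs(rating_diff) * math.sqrt(match_length)
    return 1.0 / (10.0 ** (scaled_diff / 2000.0) + 1.0)

if __name__ == "__main__":
    # Underdog's predicted chances at a 200-point rating difference.
    for n in (1, 5, 9, 25):
        print(n, round(fibs_underdog_win_prob(200, n), 4))
```

Seen this way, FIBS is a plain Elo curve whose scale is forced to be
2000/sqrt(N). The open question is whether the scale that best fits real
results at each match length actually follows that rule.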
Cheers,
Gary.
--
Gary Wong, Department of Computer Science, University of Arizona
gary@cs.arizona.edu http://www.cs.arizona.edu/~gary/