Ratings


From: Christopher Yep
Address: yep.2@osu.edu
Date: 10 October 1998
Subject: Re: fibs ratingformula
Forum: rec.games.backgammon
Google: yep.2.23.361EE00C@osu.edu

Hank Youngerman wrote: [Paraphrasing]
> What is the effect of the "2000" in "SQR(n) * (rating difference) / 2000"
> (from the FIBS rating formula)?  Is it empirically derived?

2000 is actually a constant (call it c).  If c is changed to 2000 * t (with
t > 0), the average rating will remain the same (approx. 1500).  Ratings
which differ from the population average will be scaled away from the
average by a factor of t, i.e. new_rating =  avg_rating + t * (old_rating
- avg_rating).

e.g. if avg_rating = 1500, c= 400 (i.e. t = 0.2), then

new_rating = 1500 + 0.2 * (old_rating - 1500)

e.g. 2000 under the old ratings will correspond to 1600 under the new
ratings.  1000 under the old ratings will correspond to 1400 under the
new ratings.  Old ratings between 1000 and 2000 have an equivalent new
rating between 1400 and 1600, which can be obtained by linear
interpolation.
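
The rescaling above can be checked with a short Python sketch. The function name `rescale_rating` and its parameters are illustrative only and are not part of FIBS; the population average of 1500 is the approximate figure quoted above.

```python
def rescale_rating(old_rating, t, avg_rating=1500.0):
    """Map a "true" rating under c = 2000 to its equivalent under c = 2000 * t.

    Ratings are stretched (t > 1) or squeezed (t < 1) about the average.
    """
    return avg_rating + t * (old_rating - avg_rating)


if __name__ == "__main__":
    # c = 400 corresponds to t = 400 / 2000 = 0.2
    for old in (1000, 1250, 1500, 2000):
        print(old, "->", rescale_rating(old, t=0.2))
```

Because the map is linear with a positive slope, it preserves the ordering of any two ratings, which is the point made in the next paragraph.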

Thus, the choice of c = 2000 only affects the spread of the ratings, while
not changing the ordering of "true" ratings.  i.e. consider two players
(player1 and player2): if player1_true_rating > player2_true_rating for c =
2000 then player1_true_rating > player2_true_rating for any other c > 0
(and vice-versa).

Note that the above discussion is referring to one's "true" rating.  For
small values of c there is a large amount of noise in the ratings system
(since the adjustment factor, 4 * K * SQRT (n) * P, is independent of c),
i.e. ratings will move relatively more quickly with a small c than with a
large c.  E.g. in an extreme case, if c = 1, then the "true" ratings would
likely range from 1499.75 to 1500.25, yet a 9-pt. match between two players
of equal rating would boost the winner's rating by 6 pts. (assume K = 1)
and reduce the loser's rating by 6 pts.  Obviously in this case, the rating
system would be too unreliable for use.  Many players would be massively
(in a relative sense) over- or under-rated.
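
The arithmetic in the extreme case can be reproduced with a small sketch. The helper names are made up for illustration, and the 1000-point width of "true" ratings under c = 2000 (roughly 1000 to 2000) is an assumption inferred from the 1499.75 to 1500.25 range quoted above.

```python
import math


def match_adjustment(n, p, k=1.0):
    """Rating points moved by one n-point match: 4 * K * sqrt(n) * P."""
    return 4.0 * k * math.sqrt(n) * p


def true_rating_spread(c, spread_at_2000=1000.0):
    """Width of the "true" rating range when 2000 is replaced by c.

    spread_at_2000 is an assumed width under c = 2000 (about 1000 to 2000).
    """
    return spread_at_2000 * c / 2000.0


if __name__ == "__main__":
    # Equal ratings, so P = 0.5; 9-point match; K = 1.
    print("points moved by one match:", match_adjustment(9, 0.5))
    print("width of true ratings at c = 1:", true_rating_spread(1.0))
```

With c = 1 a single 9-point match moves a rating by 6 points while the whole population of "true" ratings spans only about half a point.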

I assume that 2000 was chosen so that there would be a reasonable spread
between the high and low ratings.

[Note: below I will define "ELO-style rating formula" to be a rating
formula similar to the FIBS one in which P_upset = 1 / (10^ (D * sqrt(n)/c)
+ 1), for c > 0.]
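
As a sketch, the P_upset formula in the note can be written as a Python function. The name `p_upset` and the argument names are illustrative; D is taken to be the non-negative rating difference between the two players, which is my reading of the formula rather than something stated in this post.

```python
import math


def p_upset(d, n, c=2000.0):
    """P_upset = 1 / (10^(D * sqrt(n) / c) + 1) for an n-point match.

    d is the (non-negative) rating difference, c > 0 the scale constant.
    """
    if c <= 0:
        raise ValueError("c must be positive")
    return 1.0 / (10.0 ** (d * math.sqrt(n) / c) + 1.0)


if __name__ == "__main__":
    print(p_upset(0, 9))           # equal ratings
    print(p_upset(500, 1))         # 500-point gap under c = 2000
    print(p_upset(100, 1, c=400))  # the same gap rescaled by t = 0.2
```

The last two calls give the same value: scaling both the rating difference and c by t leaves P_upset unchanged, which is why changing c only rescales the "true" ratings.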

Assuming that an ELO-style rating formula is appropriate (although there is
a lot of evidence to the contrary), c should be chosen so that the
match-adjustment [4 * K * SQRT (n) * P] moves/changes ratings at a
relatively slow (but not too slow) rate.  If c is chosen too low, then
ratings will move too fast, i.e. they will be too volatile and thus
unreliable.  If c is chosen too high then the rating system will take a
very long time to correct the ratings of those who are significantly under-
or over-rated.

Personally, I think that the FIBS ratings system is a bit too volatile.  I
would like to see c = 4000.  Better yet, to avoid having to scale
everyone's mean-adjusted rating by 2 overnight (and thus alarming many new
users), we can equivalently just change the match-adjustment factor to [2 *
K * SQRT (n) * P].
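
The equivalence claimed here can be checked numerically. The sketch below assumes the usual FIBS convention, which this post does not spell out: P is the model's probability that the winner would have lost, i.e. P_upset when the favorite wins and 1 - P_upset when the underdog wins. The function name, the `factor` parameter, and the two ratings are hypothetical.

```python
import math


def winner_gain(r_winner, r_loser, n, c, factor, k=1.0):
    """Points gained by the winner of an n-point match: factor * K * sqrt(n) * P.

    Assumption: P is the probability that the winner would have lost.
    """
    d = abs(r_winner - r_loser)
    p_upset = 1.0 / (10.0 ** (d * math.sqrt(n) / c) + 1.0)
    p = p_upset if r_winner >= r_loser else 1.0 - p_upset
    return factor * k * math.sqrt(n) * p


avg = 1500.0
a, b, n = 1650.0, 1580.0, 5  # hypothetical ratings on today's scale

# Today's system: c = 2000, factor 4.
current = winner_gain(a, b, n, c=2000.0, factor=4.0)

# Option 1: c = 4000, factor 4, every rating stretched by 2 about the mean.
stretched = winner_gain(avg + 2 * (a - avg), avg + 2 * (b - avg), n,
                        c=4000.0, factor=4.0)

# Option 2: keep c = 2000 and today's scale, but halve the factor to 2.
halved = winner_gain(a, b, n, c=2000.0, factor=2.0)

if __name__ == "__main__":
    print("current gain:", current)
    print("option 1 gain, converted back to today's scale:", stretched / 2)
    print("option 2 gain:", halved)
```

Converted back to today's scale, option 1 moves ratings by the same amount as option 2, and both move them half as far as the current system does.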

As noted by some of the empirical evidence referenced in Gary Wong's recent
post, an ELO-style rating formula is not robust over the possible match
lengths.  Perhaps a better solution (requiring more housekeeping) would be
to have separate rating formulas for different match lengths.  Perhaps
there could be five different ratings: one for 1-pt. matches, one for 2-pt.
matches, one for 3-6 pt. matches, one for 7-16 pt. matches, and one for 17+
pt. matches.  Having a separate category for 2-pt. matches may be a little
controversial, since among expert players a 2-pt. match is virtually
identical to a 1-pt. match; among novices, however, there is still room
for cube strategy.  :-)
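
A minimal sketch of how a match could be assigned to one of the five proposed rating groups. The function name and the group labels are invented for illustration.

```python
def rating_category(match_length):
    """Return the proposed rating group for a match of the given length."""
    if match_length < 1:
        raise ValueError("match length must be at least 1")
    if match_length == 1:
        return "1"
    if match_length == 2:
        return "2"
    if match_length <= 6:
        return "3-6"
    if match_length <= 16:
        return "7-16"
    return "17+"


if __name__ == "__main__":
    for length in (1, 2, 5, 7, 16, 17, 25):
        print(length, "->", rating_category(length))
```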

Even better, the value of t (the scaling factor - see the first line of my
post) can be empirically set (different for each of the 5 rating groups) so
that the spread (high rating - low rating) in each of the 5 ratings is
approximately the same.  Perhaps one could even be assigned an "overall
rating" which would be the average of each of the 5 ratings (or maybe with
only 50% weighting on the 1-pt. and 2-pt. ratings, i.e. overall_rating =
.125 r(1) + .125 r(2) + .25 r(3-6) + .25 r(7-16) + .25 r(17+)).  This would
mean that a player has to be good at both short and long matches in order
to have a good overall rating.
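
The weighted "overall rating" can be sketched as follows. The dictionary keys and function name are illustrative, and the example player (a 1-pt. specialist whose other ratings sit at 1500) is hypothetical.

```python
# Weights from the formula above: half weight on the 1-pt. and 2-pt. ratings.
WEIGHTS = {"1": 0.125, "2": 0.125, "3-6": 0.25, "7-16": 0.25, "17+": 0.25}


def overall_rating(ratings):
    """Weighted average of a player's five per-group ratings."""
    missing = WEIGHTS.keys() - ratings.keys()
    if missing:
        raise KeyError(f"missing ratings for groups: {sorted(missing)}")
    return sum(WEIGHTS[group] * ratings[group] for group in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical 1-pt. specialist: 1900 in 1-pt. matches, 1500 elsewhere.
    specialist = {"1": 1900.0, "2": 1500.0, "3-6": 1500.0,
                  "7-16": 1500.0, "17+": 1500.0}
    print(overall_rating(specialist))
```

For this hypothetical player the 400-point lead in 1-pt. matches contributes only 0.125 * 400 = 50 points to the overall rating.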

Under most backgammon rating systems that I've seen, a player who plays
perfectly in 1-pt. matches (and who plays only 1-pt. matches) can obtain an
extremely high rating even if he is awful at cube strategy (since he will
never have to make a cube decision).  My proposal would remedy this
problem.

Just my $0.02
Chris

Tom Keith writes: Douglas Zare and Adam Stocks comment on this posting in their article "Ratings: A Mathematical Study".

### Ratings

Constructing a ratings system  (Matti Rinta-Nikkola, Dec 1998)
Converting to points-per-game  (David Montgomery, Aug 1998)
Cube error rates  (Joe Russell+, July 2009)
Different length matches  (Jim Williams+, Oct 1998)
Different length matches  (Tom Keith, May 1998)
ELO system  (seeker, Nov 1995)
Effect of droppers on ratings  (Gary Wong+, Feb 1998)
Empirical analysis  (Gary Wong, Oct 1998)
Error rates  (David Levy, July 2009)
Experience required for accurate rating  (Jon Brown+, Nov 2002)
FIBS rating distribution  (Gary Wong, Nov 2000)
FIBS rating formula  (Patti Beadles, Dec 2003)
FIBS vs. GamesGrid ratings  (Raccoon+, Mar 2006)
Fastest way to improve your rating  (Backgammon Man+, May 2004)
Field size and ratings spread  (Daniel Murphy+, June 2000)
Improving the rating system  (Matti Rinta-Nikkola, Nov 2000)
KG rating list  (Daniel Murphy, Feb 2006)
KG rating list  (Tapio Palmroth, Oct 2002)
MSN Zone ratings flaw  (Hank Youngerman, May 2004)
No limit to ratings  (David desJardins+, Dec 1998)
On different sites  (Bob Newell+, Apr 2004)
Opponent's strength  (William Hill+, Apr 1998)
Possible adjustments  (Christopher Yep+, Oct 1998)
Rating versus error rate  (Douglas Zare, July 2006)
Ratings and rankings  (Chuck Bower, Dec 1997)
Ratings and rankings  (Jim Wallace, Nov 1997)
Ratings on Gamesgrid  (Gregg Cattanach, Dec 2001)
Ratings variation  (Kevin Bastian+, Feb 1999)
Ratings variation  (FLMaster39+, Aug 1997)
Ratings variation  (Ed Rybak+, Sept 1994)
Strange behavior with large rating difference  (Ron Karr, May 1996)
Table of ratings changes  (Patti Beadles, Aug 1994)
Table of win rates  (William C. Bitting, Aug 1995)
Unbounded rating theorem  (David desJardins+, Dec 1998)
What are rating points?  (Lou Poppler, Apr 1995)
Why high ratings for one-point matches?  (David Montgomery, Sept 1995)
