Forum Archive :
Ratings
Hank Youngerman wrote: [Paraphrasing]
> What is the effect of the "2000" in "SQR(n) * (rating difference) / 2000"
> (from the FIBS rating formula)? Is it empirically derived?
2000 is actually a constant (call it c). If c is changed to 2000 * t (with
t > 0), the average rating will remain the same (approx. 1500). Ratings
which differ from the population average will be scaled away from the
average by a factor of t, i.e. new_rating = avg_rating + t * (old_rating -
avg_rating).
For example, if avg_rating = 1500 and c = 400 (i.e. t = 0.2), then
new_rating = 1500 + 0.2 * (old_rating - 1500)
So 2000 under the old ratings will correspond to 1600 under the new
ratings. 1000 under the old ratings will correspond to 1400 under the
new ratings. Old ratings between 1000 and 2000 have an equivalent new
rating between 1400 and 1600, which can be obtained by linear
interpolation.
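
The rescaling above can be sketched in a few lines of Python. This is only an illustration of the formula in this post; the function name rescale_rating and the default average of 1500 are my own choices, not anything FIBS defines.

```python
def rescale_rating(old_rating, t, avg_rating=1500.0):
    """Map a rating under c = 2000 to its equivalent under c = 2000 * t.

    Hypothetical helper: applies new = avg + t * (old - avg).
    """
    return avg_rating + t * (old_rating - avg_rating)


# c = 400 corresponds to t = 400 / 2000 = 0.2
for old in (1000, 1250, 1500, 2000):
    print(old, "->", rescale_rating(old, t=0.2))
```

Because the map is linear with t > 0, it preserves the ordering of any two ratings.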
Thus, the choice of c = 2000 only affects the spread of the ratings, while
not changing the ordering of "true" ratings. That is, consider two players
(player1 and player2): if player1_true_rating > player2_true_rating for c =
2000, then player1_true_rating > player2_true_rating for any other c > 0
(and vice versa).
Note that the above discussion refers to one's "true" rating. For
small values of c there is a large amount of noise in the rating system
(since the adjustment factor, 4 * K * SQRT(n) * P, is independent of c),
i.e. ratings will move relatively more quickly with a small c than with a
large c. E.g. in an extreme case, if c = 1, then the "true" ratings would
likely range from 1499.75 to 1500.25, yet a 9-pt. match between two players
of equal rating would boost the winner's rating by 6 pts. (assume K = 1;
with P = 0.5 for equal ratings, 4 * 1 * SQRT(9) * 0.5 = 6) and reduce the
loser's rating by 6 pts. Obviously in this case, the rating system would be
too unreliable for use. Many players would be massively (in a relative
sense) over- or underrated.
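
A quick check of the extreme case above, as a minimal sketch. The helper match_adjustment is a hypothetical name, and the 1000-point "true" spread at c = 2000 is my assumption, read back from the 1499.75 to 1500.25 range quoted above.

```python
import math


def match_adjustment(n, p, k=1.0):
    """Rating points moved by one match: 4 * K * sqrt(n) * P (hypothetical helper)."""
    return 4.0 * k * math.sqrt(n) * p


# Equal ratings give P = 0.5 whatever c is, so the swing does not shrink with c.
swing = match_adjustment(n=9, p=0.5)

# Assumed spread of "true" ratings at c = 2000, scaled by t = 1 / 2000 for c = 1.
spread_at_c1 = 1000.0 * (1.0 / 2000.0)

print("swing per 9-pt. match:", swing)
print("whole spread of true ratings at c = 1:", spread_at_c1)
print("swing / spread:", swing / spread_at_c1)
```

A single match moving a rating by many times the whole spread of true ratings is what makes small c unusable.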
I assume that 2000 was chosen so that there would be a reasonable spread
between the high and low ratings.
[Note: below I will define "ELO-style rating formula" to be a rating
formula similar to the FIBS one in which P_upset = 1 / (10^(D * sqrt(n) / c)
+ 1), for c > 0.]
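
For readers who want to play with the formula in the note, here is a small sketch. The function name upset_probability is mine; D is taken to be the (non-negative) rating difference and n the match length, as in the FIBS formula.

```python
import math


def upset_probability(d, n, c=2000.0):
    """P_upset = 1 / (10^(D * sqrt(n) / c) + 1), for c > 0 (hypothetical helper)."""
    return 1.0 / (10.0 ** (d * math.sqrt(n) / c) + 1.0)


# Same 200-point gap at several match lengths: longer matches favour the
# stronger player more, so the upset probability falls as n grows.
for n in (1, 5, 25):
    print(n, round(upset_probability(200.0, n), 4))
```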
Assuming that an ELO-style rating formula is appropriate (although there is
a lot of evidence to the contrary), c should be chosen so that the
match-adjustment [4 * K * SQRT(n) * P] changes ratings at a relatively
slow (but not too slow) rate. If c is chosen too low, then ratings will
move too fast, i.e. they will be too volatile and thus unreliable. If c is
chosen too high, then the rating system will take a very long time to
correct the ratings of those who are significantly under- or overrated.
Personally, I think that the FIBS rating system is a bit too volatile. I
would like to see c = 4000. Better yet, to avoid having to scale
everyone's mean-adjusted rating by 2 overnight (and thus alarming many new
users), we can equivalently just change the match-adjustment factor to [2 *
K * SQRT(n) * P].
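
To see why the two options are equivalent, here is a rough numerical sketch. The helper names are hypothetical, and for illustration I plug the upset probability in as P (the real formula picks P according to who won; the equivalence holds for either choice).

```python
import math


def upset_probability(d, n, c):
    """ELO-style upset probability: 1 / (10^(D * sqrt(n) / c) + 1)."""
    return 1.0 / (10.0 ** (d * math.sqrt(n) / c) + 1.0)


def adjustment(p, n, factor, k=1.0):
    """Match adjustment: factor * K * sqrt(n) * P."""
    return factor * k * math.sqrt(n) * p


d, n = 100.0, 5  # rating gap measured on today's (c = 2000) scale

# Option A: c = 4000, so every gap doubles (t = 2); factor stays 4.
p_a = upset_probability(2.0 * d, n, c=4000.0)
move_a = adjustment(p_a, n, factor=4.0) / 2.0  # converted back to today's scale

# Option B: keep c = 2000 and today's ratings, but halve the factor to 2.
p_b = upset_probability(d, n, c=2000.0)
move_b = adjustment(p_b, n, factor=2.0)

print(p_a, p_b)
print(move_a, move_b)
```

Both options give the same probabilities and the same movement relative to the spread; option B just avoids renumbering everyone.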
As noted by some of the empirical evidence referenced in Gary Wong's recent
post, an ELO-style rating formula is not robust over the possible match
lengths. Perhaps a better solution (requiring more housekeeping) would be
to have separate rating formulas for different match lengths. Perhaps
there could be five different ratings: one for 1-pt. matches, one for 2-pt.
matches, one for 3-6 pt. matches, one for 7-16 pt. matches, and one for 17+
pt. matches. Having a separate category for 2-pt. matches may be a little
controversial since among expert players a 2-pt. match is virtually
identical to a 1-pt. match; however, among novices, there is still room for
cube strategy. :)
Even better, the value of t (the scaling factor; see the first line of my
post) can be empirically set (different for each of the 5 rating groups) so
that the spread (high rating - low rating) in each of the 5 ratings is
approximately the same. Perhaps one could even be assigned an "overall
rating" which would be the average of each of the 5 ratings (or maybe with
only 50% weighting on the 1-pt. and 2-pt. ratings, i.e. overall_rating =
.125 r(1) + .125 r(2) + .25 r(3-6) + .25 r(7-16) + .25 r(17+)). This would
mean that a player has to be good at both short matches and long matches
in order to have a good overall rating.
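
The weighted overall rating can be sketched as follows. The group labels and the sample ratings are made up for illustration; only the weights come from the formula above.

```python
# Weights from the proposal: half weight on the 1-pt. and 2-pt. groups.
WEIGHTS = {"1": 0.125, "2": 0.125, "3-6": 0.25, "7-16": 0.25, "17+": 0.25}


def overall_rating(ratings):
    """Weighted average of the five per-length ratings (hypothetical helper)."""
    return sum(WEIGHTS[group] * ratings[group] for group in WEIGHTS)


# A made-up 1-pt. specialist: strong in short matches, average elsewhere.
specialist = {"1": 1900.0, "2": 1900.0, "3-6": 1500.0, "7-16": 1500.0, "17+": 1500.0}
print(overall_rating(specialist))
```

With these weights, even a very high 1-pt. and 2-pt. rating lifts the overall figure by only a quarter of the excess.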
Under most backgammon rating systems that I've seen, a player who plays
perfectly in 1-pt. matches (and who plays only 1-pt. matches) can obtain an
extremely high rating, even if he is awful at cube strategy (since he
will never have to make a cube decision). My proposal would remedy this
problem.
Just my $0.02
Chris





 
