Forum Archive :
Ratings
Error rates, whether expressed in terms of match winning chances (MWC) or
as the equivalent of money games (EMG), are used for (too?) many purposes,
and the statistics are not useful for all of the intended purposes.
Error rates expressed as MWC serve a very useful purpose when combined with
luck expressed as MWC. A match starts with each player 50% to win, so the
winner's net luck and net error rate will by definition total 50%. An
example from a recent match:
P1: error rate 12% MWC; luck +37% MWC
P2: error rate 17% MWC; luck -8% MWC
Net: error rate 5% MWC; luck +45% MWC; total +50% MWC
This is a matter of definition and will always hold true if you use the
same neural net and the same parameters for error rates and luck.
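The bookkeeping above can be sketched in a few lines. This is only an illustration of the accounting identity, using the example figures from the match quoted above (and assuming P2's luck figure is negative, i.e. bad luck):

```python
# Sketch of the MWC accounting identity described above.
# All figures are MWC percentages from the quoted example.
p1_error, p1_luck = 12.0, 37.0   # P1's errors cost 12%; P1's luck gained 37%
p2_error, p2_luck = 17.0, -8.0   # P2's errors cost 17%; P2's luck was -8%

net_error = p2_error - p1_error  # P2's extra errors favor P1
net_luck = p1_luck - p2_luck     # P1's good luck plus P2's bad luck

# The winner (P1) started at 50% and finished at 100%,
# so net error plus net luck must total +50% MWC.
print(net_error, net_luck, net_error + net_luck)
```

With the same neural net and parameters, this identity holds for any match: the winner's combined net luck and net error advantage is always +50% MWC.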
Now these numbers are very useful in terms of understanding why you won or
lost a match (P1 was outplayed a bit, but substantially outrolled), but
they are very unsatisfying for all the reasons that led to the EMG
calculation.
EMG attempts to normalize error rates to overcome at least two perceived
problems. First, the "same" error with a higher cube results in a greater
loss of MWC than with a lower cube. Second, the "same" error later in a
match results in a greater loss of MWC than earlier.
So, EMG is useful for asking the question "what were my n biggest errors in
the match." The student can study these errors and improve play.
Then we average EMG over the number of moves to get a Snowie error rate
(whether calculated over all moves as Snowie does or unforced moves as
gnubg does) and get a number. And we tend to use that number to express how
good a player is. "X plays at error rate 2 or 5 or 10" each says something
about player X that we all understand intuitively. While over a huge number
of matches, that is a good approximation, in any one match, it doesn't
necessarily work.
Moral: Don't expect too much of the Snowie error rate, particularly for a
single match.

Ratings
 Constructing a ratings system (Matti Rinta-Nikkola, Dec 1998)
 Converting to points-per-game (David Montgomery, Aug 1998)
 Cube error rates (Joe Russell+, July 2009)
 Different length matches (Jim Williams+, Oct 1998)
 Different length matches (Tom Keith, May 1998)
 ELO system (seeker, Nov 1995)
 Effect of droppers on ratings (Gary Wong+, Feb 1998)
 Empirical analysis (Gary Wong, Oct 1998)
 Error rates (David Levy, July 2009)
 Experience required for accurate rating (Jon Brown+, Nov 2002)
 FIBS rating distribution (Gary Wong, Nov 2000)
 FIBS rating formula (Patti Beadles, Dec 2003)
 FIBS vs. GamesGrid ratings (Raccoon+, Mar 2006)
 Fastest way to improve your rating (Backgammon Man+, May 2004)
 Field size and ratings spread (Daniel Murphy+, June 2000)
 Improving the rating system (Matti Rinta-Nikkola, Nov 2000)
 KG rating list (Daniel Murphy, Feb 2006)
 KG rating list (Tapio Palmroth, Oct 2002)
 MSN Zone ratings flaw (Hank Youngerman, May 2004)
 No limit to ratings (David desJardins+, Dec 1998)
 On different sites (Bob Newell+, Apr 2004)
 Opponent's strength (William Hill+, Apr 1998)
 Possible adjustments (Christopher Yep+, Oct 1998)
 Rating versus error rate (Douglas Zare, July 2006)
 Ratings and rankings (Chuck Bower, Dec 1997)
 Ratings and rankings (Jim Wallace, Nov 1997)
 Ratings on Gamesgrid (Gregg Cattanach, Dec 2001)
 Ratings variation (Kevin Bastian+, Feb 1999)
 Ratings variation (FLMaster39+, Aug 1997)
 Ratings variation (Ed Rybak+, Sept 1994)
 Strange behavior with large rating difference (Ron Karr, May 1996)
 Table of ratings changes (Patti Beadles, Aug 1994)
 Table of win rates (William C. Bitting, Aug 1995)
 Unbounded rating theorem (David desJardins+, Dec 1998)
 What are rating points? (Lou Poppler, Apr 1995)
 Why high ratings for one-point matches? (David Montgomery, Sept 1995)