H. G. Muller wrote on Tue, May 20, 2008 06:43 PM UTC:
Well, to get an impression of what you can expect: In my first versions of
Joker80 I still used the Larry-Kaufman piece values of 8x8 Chess. So the
Bishop was half a Pawn too low, nearly equal to the Knight (as with more
than 5 Pawns, Kaufman has a Knight worth more than a lone Bishop,
neutralizing a large part of the pair bonus). Now, unlike a Rook, a Bishop is
very easy to trade for a Knight, as both get into play early. Making the
trade usually wrecks the opponent's pawn structure by creating a doubled
Pawn, giving enough compensation to make it attractive.
So in almost all games Joker ended up with two Knights against two Bishops
after 12 moves or so. Fixing that increased the playing strength by
~100 Elo points. So where the old version would score 50%, the improved
version would score 57%.
Now a similarly bad value for the Rook would manifest itself much less
readily: the Rooks get into play late, there is no nearly equal piece
for which a 1:1 trade changes sign, and you would need 1:3 trades (R vs
B+2P) or 2:2 trades (R+P for N+N), which are much more difficult to set
up. So I would expect that being half a Pawn off on the Rook value would
only reduce your score by about 3%, rather than 7% as with the Bishop.
After playing 100 games, the score differs by more than 3% from the true
win probability more often than not. So you would need at least 400 games
to show with minimal confidence that there was a difference.
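
To make the arithmetic behind those game counts concrete, here is a rough sketch of the statistical error of a match score. It treats the games as independent and uses a normal approximation. The win/draw/loss probabilities are made-up illustrative defaults, not measured Joker80 numbers, and the function names are hypothetical.

```python
import math

def score_std_error(games, win=0.35, draw=0.30, loss=0.35):
    """Standard error of the average score over `games` independent games.

    win/draw/loss are assumed per-game probabilities (illustrative only).
    A win scores 1, a draw 0.5, a loss 0.
    """
    if abs(win + draw + loss - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mean = win + 0.5 * draw
    variance = win * 1.0 + draw * 0.25 - mean ** 2
    return math.sqrt(variance / games)

def prob_off_by_more_than(delta, games, **probs):
    """Chance that the measured score misses the true one by more than delta
    (two-sided, normal approximation)."""
    se = score_std_error(games, **probs)
    return math.erfc(delta / (se * math.sqrt(2)))

if __name__ == "__main__":
    for n in (100, 400, 1600):
        se = score_std_error(n)
        p = prob_off_by_more_than(0.03, n)
        print(f"{n:5d} games: std error {100 * se:.1f}%, "
              f"P(|error| > 3%) = {p:.2f}")
```

Under these assumptions the error after 100 games is around 4%, so a 3% effect is smaller than the noise. Quadrupling the number of games only halves the error, which is why 400 games is about the minimum. With few draws the per-game variance is larger and the chance of being off by more than 3% after 100 games exceeds one half.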
Beware that the results of the games are stochastic quantities. Replay the
game at the same time control, and the game Joker80 plays will be
different. And often the result will be different. This is true at 1 sec
per move, but it is equally true at 1 year per move. The games that will
be played are just a sample from the myriads of games Joker80 could play
with non-zero probability. And with fewer than 400 games, the difference
between the actually measured score percentage and the probability you
want to determine will be in most cases larger than the effect of the
piece values, if they are not extremely wrong (e.g. setting Q < B).