Derek Nalls wrote on Sun, May 4, 2008 06:38 AM UTC:
'I never found any effect of the time control on the scores I measure for
some material imbalance. Within statistical error, the combinations I
tried produced the same score at 40/15', 40/20', 40/30', 40/40',
40/1', 40/2', 40/5'. Going to even longer TC is very expensive, and I
did not consider it worth doing just to prove that it was a waste of
time...'
_________
The additional time I normally give playtesting games to improve move
quality is partially wasted because most chess variant programs let me
control only the time per move, not the number of plies completed. The
time usually expires while the program is still working on an incomplete
ply. It then prematurely outputs a move representative of an incomplete
tour of the moves available within that ply, cut off at a random fraction
of it. Since there is always more than one move (often a few to several)
still under evaluation as possibly the best [otherwise, the chosen move
would already have been executed], any move on this 'list of top
candidates' is roughly equally likely to be the one executed.
Here are two typical scenarios that should cover what usually happens:
A. If the list of top candidates in an 11-ply search consists of 6 moves,
whereas the list of top candidates in a 10-ply search consists of 7 moves,
then only 1 discovered-to-be-less-than-the-best move has been successfully
excluded and can no longer be executed.
Of course, completing an 11-ply search typically requires an estimated
8-10 times as much time as the completed searches for all previous plies
(1-ply through 10-ply) added together.
OR
B. If the list of top candidates in an 11-ply search consists of the same
7 moves [indeed, exactly the same 7 moves] as the preceding 10-ply search,
then there is no benefit at all to expending 8-10 times as much time.
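The 8-10 times figure follows from the exponential growth of search time with depth: if each additional ply multiplies the search time by an effective branching factor b, a geometric-series argument shows that depth d alone costs roughly (b - 1) times the total time spent on all shallower depths combined. A minimal sketch (the branching factor of 9 below is an assumption chosen for illustration, not a measured value):

```python
# Sketch: if time per depth grows geometrically with effective
# branching factor b, then depth d alone costs about (b - 1) times
# the total time spent on all shallower depths combined.
def depth_cost_ratio(b, d):
    cost_d = b ** d                                     # time for depth d alone
    cost_shallower = sum(b ** k for k in range(1, d))   # depths 1 .. d-1
    return cost_d / cost_shallower

# Illustrative (assumed) branching factor of 9 at depth 11:
print(round(depth_cost_ratio(9, 11), 2))  # -> 8.0, i.e. roughly b - 1
```

With b anywhere in the 9-11 range, the ratio lands in the 8-10 range quoted above.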
______________________________________________________________
The reason I endure this brutal waiting game is not the purely masochistic
experience but that the additional time has a tangible chance (though no
guarantee) of yielding a better move on every occasion. Over the numerous
moves within a typical game, it can realistically be expected to yield
better moves on dozens of occasions.
We usually playtest for purposes at opposite extremes of the spectrum,
yet I regard our efforts as complementary toward building a complete
picture of the material values of pieces.
You use 'asymmetrical playtesting' with unequal armies on fast time
controls, collect and analyze statistics ... to determine a range, with a
margin of error, for individual material piece values.
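As I understand the statistical approach, the margin of error on a measured score shrinks with the square root of the number of games: the standard error of a win fraction p over n independent games is sqrt(p(1-p)/n). A hedged sketch of that arithmetic (the 52% score over 400 games is an invented example, not actual data from these tests):

```python
import math

# Approximate 95% confidence half-width for a measured win fraction p
# over n independent games, using the normal approximation.
def score_margin(p, n):
    se = math.sqrt(p * (1 - p) / n)   # standard error of the fraction
    return 1.96 * se

# Hypothetical numbers for illustration only:
print(round(score_margin(0.52, 400), 3))  # -> 0.049, i.e. 52% +/- ~4.9%
```

This is why many fast games can pin down a value: quadrupling the game count halves the margin of error.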
I remain amazed (although I believe you) that you obtain any meaningful
results at all from games played so quickly that the AI players do not
have 'enough time to think', in games so complex that every computer (and
person) needs time to think to play with even minimal competence. Can you
explain, in a way I can understand, how and why you are able to obtain
valuable results using this method?
The quality of your results was utterly surprising to me. I apologize for
totally doubting you when you introduced your results and mentioned how you
obtained them.
I use 'symmetrical playtesting' with identical armies on very slow time
controls to obtain the best moves realistically possible from an
evaluation function, thereby giving me a winner that is, by some margin,
more likely than not the deserving one ... to determine which of two sets
of material piece values is probably (yet not certainly) better.
Nonetheless, as more games are played in the same manner ... if they
present a clear pattern, then the results become more likely to be
reliable, decisive and indicative of the true state of affairs.
The chance of a single coin flip landing 'heads' equals the chance of it
landing 'tails'. However, the chance of flipping a coin 7 times and it
landing 'heads' all 7 times in a row is (1/2)^7 = 1/128. Now, replace the
concepts 'heads' and 'tails' with 'victory' and 'defeat'. I presume you
follow my point.
The results of only a modest number of well-played games can establish
significance beyond chance, to the satisfaction of reasonable probability
for a rational human mind. [Most of us, including me, do not need better
than 95%-99% confidence to become convinced that a real correlation is at
work, even though such falls far short of absolute 100% mathematical
proof.]
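The coin-flip reasoning can be made exact with the binomial distribution: under the null hypothesis of evenly matched sides, the chance of one side winning all n games is (1/2)^n, and more generally the chance of winning at least k of n games is the binomial tail sum. A small sketch of that calculation:

```python
from math import comb

# Probability of at least k wins in n games if each game were a fair
# coin flip (the null hypothesis of evenly matched sides).
def tail_prob(k, n):
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

print(tail_prob(7, 7))             # -> 0.0078125, i.e. 1/128 as above
print(round(tail_prob(9, 10), 4))  # -> 0.0107, 9+ wins out of 10 games
```

So even an imperfect 9-out-of-10 result already clears the 95%-99% bar described above.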
In my experience, using any less than 10 minutes per move causes at least
one instance within a game in which an AI player makes a move that is
obvious to me (and correctly assessed as truly being) a poor move.
Whenever this occurs, it renders my playtesting results tainted and
useless for my purposes. This sometimes occurs during a game played at
30 minutes per move, but rarely during a game played at 90 minutes per
move.
For my purposes, it is critically important above all other considerations
that the winner of these time-consuming games be correctly determined
'most of the time', since 'all of the time' is impossible to assure.
I must do everything within my power to get as far from 50% toward 100%
reliability in correctly determining the winner. Hence, I am compelled to
play test games at nearly the longest survivable time per move, to
minimize the chance that any move played during a game will be an
obviously poor move that changes the destiny of the game, causing the
player that should have won to become the loser instead. In fact, I feel
as if I have no choice under the circumstances.