Check out Atomic Chess, our featured variant for November, 2024.

Enter Your Reply

The Comment You're Replying To
Derek Nalls wrote on Sun, May 4, 2008 06:38 AM UTC:
'I never found any effect of the time control on the scores I measure for
some material imbalance. Within statistical error, the combinations I
tries produced the same score at 40/15', 40/20', 40/30', 40/40',
40/1', 40/2', 40/5'. Going to even longer TC is very expensive, and I
did not consider it worth doing just to prove that it was a waste of
time...'
_________

The additional time I normally give to playtesting games to improve the
move quality is partially wasted because I can only control the time per
move instead of the number of plies completed using most chess variant
programs.  This usually results in the time expiring while it is working
on an incomplete ply.  Then, it prematurely spits out a move
representative of an incomplete tour of the moves available within that
ply at a random fraction of that ply.  Since there is always more than one
move (often, a few-several) under evaluation as being the best possible
move [Otherwise, the chosen move would have already been executed.], this
means that any move on this 'list of top candidates' is equally likely
to be randomly executed.

Here are two typical scenarios that should cover what usually happens:

A.  If the list of top candidates in an 11-ply search consists of 6 moves
where the list of top candidates in a 10-ply search consists of 7 moves,
then only 1 discovered-to-be-less-than-the-best move has been successfully
excluded and cannot be executed.  

Of course, an 11-ply search completion may typically require est. 8-10
times as much time as the search completions for all previous plies (1-ply
thru 10-ply) up until then added together.

OR

B.  If the list of top candidates in an 11-ply search consists of 7 moves
[Moreover, the exact same 7 moves.] just as the preceding 10-ply search, 
then there is no benefit at all in expending 8-10 times as much time.
______________________________________________________________

The reason I endure this brutal waiting game is not for purely masochistic
experience but because the additional time has a tangible chance (although
no guarantee) of yielding a better move with every occasion.  Throughout
the numerous moves within a typical game, it can be realistically expected
to yield better moves on dozens of occasions.

We usually playtest for purposes at opposite extremes of the spectrum 
yet I regard our efforts as complimentary toward building a complete 
picture involving material values of pieces.

You use 'asymmetrical playtesting' with unequal armies on fast time 
controls, collect and analyze statistics ... to determine a range, with a
margin of error, for individual material piece values.

I remain amazed (although I believe you) that you actually obtain any 
meaningful results at all via games that are played so quickly that the AI
players do not have 'enough time to think' while playing games so complex
that every computer (and person) needs time to think to play with minimal
competence.  Can you explain to me in a way I can understand how and why
you are able to successfully obtain valuable results using this method? 
The quality of your results was utterly surprising to me.  I apologize for
totally doubting you when you introduced your results and mentioned how you
obtained them.

I use 'symmetrical playtesting' with identical armies on very slow time
controls to obtain the best moves realistically possible from an
evaluation function thereby giving me a winner (that is by some margin
more likely than not deserving) ... to determine which of two sets of
material piece values is probably (yet not certainly) better. 
Nonetheless, as more games are likewise played ...  If they present a
clear pattern, then the results become more probable to be reliable, 
decisive and indicative of the true state of affairs.

The chances of flipping a coin once and it landing 'heads' are equal to
it landing 'tails'.  However, the chances of flipping a coin 7 times and
it landing 'heads' all 7 times in a row are 1/128.  Now, replace the
concepts 'heads' and 'tails' with 'victory' and 'defeat'.  I
presume you follow my point.

The results of only a modest number of well-played games can definitely
establish their significance beyond chance and to the satisfaction of 
reasonable probability for a rational human mind.  [Most of us, including
me, do not need any better than a 95%-99% success to become convinced that
there is a real correlation at work even though such is far short of an
absolute 100% mathematical proof.]

In my experience, I have found that using any less than 10 minutes per
move will cause at least one instance within a game when an AI player
makes a move that is obvious to me (and correctly assessed as truly being)
a poor move.  Whenever this occurs, it renders my playtesting results 
tainted and useless for my purposes.  Sometimes this occurs during a 
game played at 30 minutes per move.  However, this rarely occurs during 
a game played at 90 minutes per move.

For my purposes, it is critically important above all other considerations
that the winner of these time-consuming games be correctly determined 
'most of the time' since 'all of the time' is impossible to assure.
I must do everything within my power to get as far from 50% toward 100%
reliability in correctly determining the winner.  Hence, I am compelled to
play test games at nearly the longest survivable time per move to minimize
the chances that any move played during a game will be an obviously poor 
move that could have changed the destiny of the game thereby causing 
the player that should have won to become the loser, instead.  In fact, 
I feel as if I have no choice under the circumstances.

Edit Form
Conduct Guidelines
This is a Chess variants website, not a general forum.
Please limit your comments to Chess variants or the operation of this site.
Keep this website a safe space for Chess variant hobbyists of all stripes.
Because we want people to feel comfortable here no matter what their political or religious beliefs might be, we ask you to avoid discussing politics, religion, or other controversial subjects here. No matter how passionately you feel about any of these subjects, just take it someplace else.
Avoid Inflammatory Comments
If you are feeling anger, keep it to yourself until you calm down. Avoid insulting, blaming, or attacking someone you are angry with. Focus criticisms on ideas rather than people, and understand that criticisms of your ideas are not personal attacks and do not justify an inflammatory response.
Quick Markdown Guide

By default, new comments may be entered as Markdown, simple markup syntax designed to be readable and not look like markup. Comments stored as Markdown will be converted to HTML by Parsedown before displaying them. This follows the Github Flavored Markdown Spec with support for Markdown Extra. For a good overview of Markdown in general, check out the Markdown Guide. Here is a quick comparison of some commonly used Markdown with the rendered result:

Top level header: <H1>

Block quote

Second paragraph in block quote

First Paragraph of response. Italics, bold, and bold italics.

Second Paragraph after blank line. Here is some HTML code mixed in with the Markdown, and here is the same <U>HTML code</U> enclosed by backticks.

Secondary Header: <H2>

  • Unordered list item
  • Second unordered list item
  • New unordered list
    • Nested list item

Third Level header <H3>

  1. An ordered list item.
  2. A second ordered list item with the same number.
  3. A third ordered list item.
Here is some preformatted text.
  This line begins with some indentation.
    This begins with even more indentation.
And this line has no indentation.

Alt text for a graphic image

A definition list
A list of terms, each with one or more definitions following it.
An HTML construct using the tags <DL>, <DT> and <DD>.
A term
Its definition after a colon.
A second definition.
A third definition.
Another term following a blank line
The definition of that term.