Comments by HGMuller
'I cannot speak for Reinhard Scharnagl at all, though.' This is exactly the problem. 'Base value' for Pawns is a very ill-defined concept, as it is the smallest of all piece base values, while the positional terms relating to Pawns are usually the largest of all positional terms. And the whole issue of pawn-structure evaluation in Joker is so complex that I am not even sure if the average of positional terms (over all pawns and over a typical game) is positive or negative. Pawns get penalties for being doubled, or for having no Pawns next to or behind them on neighboring files. They get points for advancing, but they get penalties for creating squares that can no longer be defended by any Pawn. My guess is that in general the positional terms are slightly positive, even for non-passers not involved in King Safety. A statement like 'a Knight is worth exactly 3 Pawns' is only meaningful after exactly specifying which kind of pawn. If the Scharnagl model evaluates all non-passers exactly the same (except, perhaps, edge Pawns), then the question still arises how to most closely approximate that in Joker80, which doesn't. And simply setting the Joker80 base value equal to the single value of the Scharnagl model is very unlikely to do it. Good differentiation in Pawn evaluation is likely to impact playing strength much more than the relative value of Pawns and Pieces, as Pawns are traded for other Pawns (or such trades are declined by pushing the Pawn and locking the chains) much more often than they can be traded for Pieces.
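To make the kind of pawn-structure terms described above concrete, here is a minimal sketch in Python. The weights are invented for illustration; Joker's actual terms and values are not given in this thread.

```python
def pawn_term(file, rank, pawn_files):
    """Positional term for one pawn, in centipawns (illustrative weights only).

    file, rank: the pawn's coordinates (rank 2 = its starting rank)
    pawn_files: list of files occupied by pawns of the same color
    """
    score = 0
    if pawn_files.count(file) > 1:
        score -= 20                      # doubled-pawn penalty
    if (file - 1) not in pawn_files and (file + 1) not in pawn_files:
        score -= 15                      # isolated-pawn penalty
    score += 4 * (rank - 2)              # small bonus for advancing
    return score
```

Whether the average of such terms over a whole game comes out positive or negative depends entirely on the weights, which is exactly the point being made above.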
'Do you think these piece values will work smoothly with Joker80 running under Winboard F yet remain true to all three models?' Yes, I think these values will not conflict in any way with any of the hard-wired value approximations that are used for pruning decisions. At least not to the point where it would lead to any observable effect on playing strength. (Prunings based on the piece values occur only close to the leaves, and engines are usually quite insensitive to how exactly you prune there.)
'Let me provide another challenge for people here regarding pawns. How much is a pawn that moves only one space forward (not initial 2) but starts on the third row instead of second worth in contrast to a normal chess pawn? How much is it worth alone, and then in a line of pawns that start on the third row?' But this is a totally normal FIDE Pawn... It would get a pretty large positional penalty if it was alone (isolated-pawn penalty). In a complete line of pawns on the 3rd rank it would be worth a lot more, as it would not be isolated, and not be backward. All in all it would be fairly similar to having a line of Pawns on second rank, as the bonus for pushing the Pawns forward 1 square is approximately cancelled by not having Pawn control anymore over any of the squares on the 3rd rank.
'Because of all this, I suggest evaluating entire configuration of pieces, rather than a single piece.' This is exactly what Chess engines do. But it is a subject that transcends piece values. Material evaluation is supposed to answer the question: 'what combination of pieces would you rather have, without knowing where they stand on the board'. Piece values are an attempt to approximate the material evaluation as a simple sum of the value of the individual pieces, making up the army. It turns out that material evaluation is by far the largest component of the total evaluation of a Chess position. And this material evaluation again can be closely approximated by a sum of piece values. The most well-known exception is the Bishop pair: having two Bishops is worth about half a Pawn more than double the value of a single Bishop. Other non-additive terms are those that make the Bishop and Rook value dependent on the number of Pawns present. To account for such effects some engines (e.g. Rybka) have tabulated the total value of all possible combinations of material (ignoring promotions) in a 'material table'. Such tables can then also account for the material component of the evaluation that gives the deviation from the sum of piece values due to cooperative effects between the various pieces. Useful as this may be, it remains true that piece values are by far the largest contribution to the total evaluation. The only positional terms that can compete with it are passed pawns (a Pawn on 7th rank is worth nearly 2.5 normal Pawns) and King Safety (having a completely exposed King in the middle game, when the opponent still has a Queen or similar super-piece, can be worth nearly a Rook).
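As a minimal illustration of material evaluation as a sum of piece values plus a non-additive bishop-pair term (the centipawn numbers here are conventional textbook approximations, not Joker80's or Rybka's actual values):

```python
# Conventional approximate piece values in centipawns (illustrative only).
PIECE_VALUE = {'P': 100, 'N': 325, 'B': 325, 'R': 500, 'Q': 950}

def material_eval(pieces):
    """Material score of one army, given as a list of piece letters."""
    total = sum(PIECE_VALUE[p] for p in pieces)
    if pieces.count('B') >= 2:
        total += 50   # bishop pair: ~half a Pawn beyond twice a lone Bishop
    return total
```

A material table as used by e.g. Rybka would tabulate such totals (plus further cooperative corrections) for every material combination, but the additive sum above already captures most of it.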
Derek Nalls: | This might require very deep runs of moves with a completion time | of a few weeks to a few months per pair of games to achieve | conclusive results. It still escapes me what you hope to prove by playing at such an excessively long Time Control. If the result would be different from playing at a more 'normal' TC, like one or two hours per game (which IMO will not be the case), it would only mean that any conclusions you draw from them would be irrelevant for playing Chess at normal TC. Furthermore, playing 2 games will be like flipping a coin. The result, whatever it is, will not prove anything, as it would be different if you repeated the test. Experiments that do not give a fixed outcome tell you nothing, unless you conduct enough of them to get a good impression of the probability for each outcome to occur.
Derek: | Conclusions drawn from playing at normal time controls are | irrelevant compared to extremely-long time controls. First, that would only be true if the conclusions would actually depend on the TC. Which is a totally unproven conjecture on your part, and in fact contrary to every observation made at TCs where such observations can be made with any accuracy (because enough games can be played). This whole thing reminds me of my friend, who always claims that stones fall upward. When I then drop a stone to refute him, he just shrugs, and says it proves nothing because the stone is 'not big enough'. Very conveniently for him, the upward falling of stones can only be observed on stones that are too big for anyone to lift... But the main point is of course: if you draw a conclusion that is valid only at a TC that no one is interested in playing, what use would such a conclusion be? | The chance of getting the same flip (heads or tails) twice-in-a-row | is 1/4. Not impressive but a decent beginning. Add a couple or a | few or several consecutive same flips and it departs 'luck' by a | huge margin. Actually the chance for twice the same flip in a row is 1/2, unless you are biased as to what the outcome of the flip should be (one-sided testing). And indeed, 10 identical flips in a row would be unlikely to occur by luck by a large margin. But that is rather academic, because you won't see 10 identical results in a row between the subtly different models. You will see results like 6-4 or 7-3, which will again be very likely to be a result of luck (as that is exactly what they are the result of, as you would realize after 10,000 games when the result is standing at 4,628-5,372). Calculate the number of games you typically need to get a result for a 53-47 advantage that could not just as easily have been obtained from a 50-50 chance with a little luck. You will be surprised...
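The statistics can be checked with a few lines of Python: the exact binomial probability that a fair 50-50 process produces a result at least as lopsided as some observed score.

```python
from math import comb

def prob_at_least(wins, games, p=0.5):
    """Probability of scoring >= wins out of games when the true chance is p."""
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(wins, games + 1))

# A 7-3 result looks decisive, but pure luck gives 7 or more out of 10
# about 17% of the time:
print(prob_at_least(7, 10))    # 0.171875

# How often would a fair 50-50 matchup produce at least 53 wins in 100 games
# by luck alone? (Run it and see how weak the evidence is.)
print(prob_at_least(53, 100))
```

A result has to stand many standard deviations away from 50% before luck can be ruled out, which is why small match scores between nearly equal armies prove nothing.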
| I have wondered why the performance of computer chess programs is | unpredictable and varied even under identical controls. Despite | their extraordinary complexity, I think of computer hardware, | operating systems and applications (such as Joker80) as deterministic. In most engines there always is some residual indeterminism, due to timing jitter. There are critical decision points, where the engine decides if it should do one more iteration or not (or search one more move vs aborting the iteration). If it took such decisions purely on internal data, like node count, it would play 100% reproducibly. But most engines use the system clock (to not forfeit on time if the machine is also running other tasks), and experience the timing jitter caused by other processes running, or rotational delays of the hard disk they had been using. In multi-threaded programs this is even worse, as the scheduling of the threads by the OS is unpredictable. Even the position where exactly the program is loaded in physical memory might have an effect. But in Joker the source of indeterminism is much less subtle: it is programmed explicitly. Joker uses the starting time of the game as the seed of a pseudo-random-number generator, and uses the random numbers generated with the latter as a small addition to the evaluation, in order to lift the degeneracy of exactly identical scores, and provide a bias for choosing the move that leads to the widest choice of equivalent positions later. The non-determinism is a boon, rather than a bust, as it allows you to play several games from an identical position, and still do a meaningful sampling of possible games, and of the decisions that lead to their results. If one position would always lead to the same game, with the same result (as would occur if you were playing a simple end-game with the aid of tablebases), it would not tell you anything about the relative strength of the armies.
It would only tell you that this particular position was won / drawn. But nothing about the millions of other positions with the same material on the board. And the value of the material is by definition an average over all these positions. So with deterministic play, you would be forced to sample the initial positions, rather than using the indeterminism of the engine to create a representative sample of positions before anything is decided. | In fact, to the extent that your remarks are true, they will | support my case if my playtesting is successful that the | unlikelihood of achieving the same outcome (i.e., wins or | losses for one player) is extreme. This sentence is too complicated for me to understand. 'Your case' is that 'the unlikelihood of achieving the same outcome is extreme'? If the unlikelihood is extreme, is that the same as the likelihood being extreme? Is the 'unlikelihood to be the same' the same as the 'likelihood to be different'? What does 'extreme' mean for a likelihood? Extremely low or extremely high? I wonder if anything is claimed here at all... I think you make a mistake by seeing me as a low-quality advocate. I only advocate the minimum quantity that does not make the results inconclusive. Unfortunately, that minimum is high, despite my best efforts to make it as low as possible through asymmetric playtesting and playing material imbalances in pairs (e.g. 2 Chancellors against two Archbishops, rather than one vs one). And that minimum quantity puts limits on the maximum quality that I can afford with my limited means. So it would be more accurate to describe me as a minimum-(significant)-quantity, maximum-(affordable)-quality advocate...
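The explicitly programmed indeterminism described above can be sketched in a few lines, under the assumption of a time-based seed and a small centipawn perturbation (the constants are made up; Joker80's real code is not shown in this thread):

```python
import random
import time

# Seed once per game, from the game's starting time, as described above.
rng = random.Random(int(time.time()))

def perturbed_eval(static_cp, noise_cp=4):
    """Add a small random bonus to a static evaluation to break ties
    between otherwise exactly equal scores."""
    return static_cp + rng.randint(0, noise_cp)
```

Two games started at different times get different seeds, so identical positions can lead to different, but still sensible, move choices, which is what makes repeated games from one position a meaningful sample.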
'Do not you realize that forcing Joker80 to do otherwise must reduce its playing strength significantly from its maximum potential?' On the contrary, it makes it stronger. The explanation is that by adding a random value to the evaluation, branches with very many equal end leaves have a much larger probability to have the highest random bonus amongst them than a branch that leads to only a single end-leaf of that same score. The difference can be observed most dramatically when you evaluate all positions as zero. This makes all moves totally equivalent at any search depth. Such a program would always play the first legal move it finds, and would spend the whole game moving its Rook back and forth between a1 and b1, while the opponent is eating all its other pieces. OTOH, a program that evaluates every position as a completely random number starts to play quite reasonable Chess, once the search reaches 8-10 ply. Because it is biased to seek out moves that lead to pre-horizon nodes that have the largest number of legal moves, which usually are the positions where the strongest pieces are still in its possession. It is always possible to make the random addition so small that it only decides between moves that would otherwise have exactly equal evaluation. But this is not optimal, as it would then prefer a move (in the root) that could lead (after 10 ply or so) to a position of score 53 (centiPawn), while all other choices later in the PV would lead to -250 or worse, over a move that could lead to 20 different positions (based on later move choices) all evaluating as 52cP. But, as the scores were just approximations based on finite-depth search, two moves later, when it can look ahead further, all the end-leaf scores will change from what they were, because those nodes are now no longer end-leaves. The 53cP might now be 43cP because deeper search revealed it to disappoint by 10cP.
But alas, there is no choice: the alternatives in this branch might have changed a little too, but now all range from -200 to -300. Not much help, we have to settle for the 43cP... Had it taken the root move that keeps the option open to go to any of the 20 positions of 52cP, it would now see that their scores on deeper search would have been spread out between 32cP and 72cP, and it could now go for the 72cP. In other words, the investment of keeping its options open rather than greedily commit itself to going for an uncertain, only marginally better score, typically pays off. To properly weight the expected pay-back of keeping options that at the current search depth seem inferior, it must have an idea of the typical change of a score from one search depth to the next. And match the size of the random eval addition to that, to make sure that even slightly (but insignificantly) worse end-leaves still contribute to enhancing the probability that the branch will be chosen. Playing a game in the face of an approximate (and thus noisy) evaluation is all about contingency planning. As to the probability theory, you don't seem to be able to see the math because of the formulae...
P(hh) = 0.5 * 0.5 = 0.25
P(tt) = 0.5 * 0.5 = 0.25
P(two equal flips) = P(hh) + P(tt) = 0.25 + 0.25 = 0.5
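The pay-off of keeping options open can be checked with a small Monte-Carlo experiment: a branch leading to twenty 52cP positions, each leaf drawing its own random bonus, usually out-draws a branch leading to a single 53cP position. The 10cP noise amplitude here is an arbitrary choice for the illustration, not Joker80's actual parameter.

```python
import random

rng = random.Random(42)

def best_with_bonus(scores, noise=10.0):
    """Best score in a branch after each end-leaf draws an independent
    random bonus in [0, noise)."""
    return max(s + rng.uniform(0.0, noise) for s in scores)

# In what fraction of trials does the 20-option branch out-draw the single leaf?
wins = sum(best_with_bonus([52.0] * 20) > best_with_bonus([53.0])
           for _ in range(10_000))
print(wins / 10_000)    # roughly 0.85 with these parameters
```

So the move that preserves many nearly equal continuations gets preferred most of the time, even though its best single leaf scores one centipawn lower.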
Indeed, it is a stochastic way to simulate mobility evaluation. In the presence of other terms it should of course not be made so large that it dominates the total evaluation. Like explicit mobility terms should not dominate the evaluation. But its weight should not be set to zero either: properly weighted mobility might add more than 100 Elo to an engine. Joker has no explicit mobility in its evaluation, and relies entirely on this probabilistic mechanism to simulate it. The disadvantage is that, because of the probabilistic nature, it is not 100% guaranteed to always take the best decision. On rare occasions the single acceptable end leaf does draw a higher random bonus than one hundred slightly better positions in another branch. OTOH it is extremely cheap to implement, while explicit mobility is very expensive. As a result, I might gain an extra ply in search depth. And then it becomes superior to explicit mobility, as it only counts tactically sound moves, rather than just every move. So it is like safe mobility verified by a full Quiescence Search. In my assessment, the probabilistic mobility adds more strength to Joker than changing the Rook value by 50cP would add or subtract. This can be easily verified by play-testing. It is possible to switch this evaluation term off. In fact, you have to switch it on, but WinBoard does this by default. To prevent it from being switched on, one should run WinBoard with the command-line option /firstInitString='new'. (The default setting is 'new\nrandom'. If Joker is running as second engine, you will of course have to use /secondInitString='new'.)
I would have thought that 'twice the same flip in a row' was pretty unambiguous, especially in combination with the remark about two-sided testing. But let's not quibble about the wording. The point was that for two-sided testing, if you suspect a coin to be loaded, but have no idea if it is loaded to produce tails or heads, the two flips tell you exactly nothing. They are either the same or different, and on an unbiased coin that would occur with equal probability. So the 'confidence' of any conclusion as to the fairness of the coin drawn from the two flips would be only 50%. I.e. not better than totally random: you might as well have guessed if it was fair or not without flipping it at all. That would also have given you a 50% chance of guessing correctly.
Derek: 'I hope you can handle constructive advice.' It gives me a big laugh, that's for sure. Of course none of what you say is even remotely true. That is what happens if you jump to conclusions regarding complex matters you are not knowledgeable about, without even taking the trouble to verify your ideas. Of course I extensively TESTED how the playing strength of Joker80 (and all available other engines) varied as a function of time control. This was the purpose of several elaborate time-odds tournaments I conducted, where various versions of most engines participated that had to play their games in 36, 12, 4, 1:30, 0:40 or 0:24 min, where handicapped engines were meeting non-handicapped ones in a full round robin. (I.e. the handicaps were factors 3, 9, 24, 54 or 90, where only the strongest engines were handicapped up to the very maximum, and the weakest only participated in an unhandicapped version.) And of course Joker80 behaves similar to any Shannon-type engine that is reasonably free of bugs: its playing strength measured in Elo monotonically increases in a logarithmic fashion, approximately according to the formula rating = 100*ln(time). So Joker80 at 5 min/move crushes Joker80 at 1 sec per move, as you could have easily found out for yourself. So much for your nonsense about Joker80 failing to improve its move quality with time. For some discussion on one of the tournaments, see: http://www.talkchess.com/forum/viewtopic.php?t=19764&postdays=0&postorder=asc&topic_view=flat&start=34 At that time Fairy-Max still had a hash-table bug that made it hang (and subsequently forfeit on time), striking at a fixed rate per second, so that Fairy-Max started to forfeit more and more games at longer TC. Since then the bug has been identified and repaired, and now also Fairy-Max performs progressively better at longer TC. So nice try, but next time better save your breath for telling the surgeon how to do his job before he will perform open heart surgery on you.
Because he has no doubt much more to learn from you regarding cardiology than I have in the area of building Chess engines... Things are as they are, and can become known by observation and testing. Believing in misconceptions born out of ignorance is not really helpful. Or, more explicitly: if you think you know how to build better Chess engines than other people, by all means, do so. It will be fun to confront your ideas with reality. In the meantime I will continue to build them as I think best (and know is best, through extensive testing), so you should have every chance to surpass them. Lacking that, you could at least _use_ the engines of others to check out if your theories of how they behave have any relation to reality. You don't have to depend on the time-odds tourneys and other tests I conduct. You might not even be aware of them, as the developers of Chess engines hardly ever publish the thousands of games they play for testing whether their ideas work in practice.
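The logarithmic rating model quoted above, rating = 100*ln(time), implies a fixed Elo gain per fixed multiplication of thinking time. A quick check of the numbers:

```python
from math import log

def rating_gain(time_factor):
    """Elo gained by multiplying thinking time by time_factor,
    under the rating = 100*ln(time) model quoted above."""
    return 100 * log(time_factor)

print(rating_gain(2))      # ~69 Elo per doubling of thinking time
print(rating_gain(300))    # 1 sec/move -> 5 min/move: ~570 Elo
```

An Elo difference of several hundred points corresponds to a nearly certain win for the stronger side, which is why the 5 min/move version crushes the 1 sec/move version.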
Derek Nalls: | Nonetheless, completing games of CRC (where a long, close, | well-played game can require more than 80 moves per player) | in 0:24 minutes - 36 minutes does NOT qualify as long or even, | moderate time controls. In the case of your longest 36-minute games, | with an example total of 160 moves, that allows just 13.5 seconds per | move per player. In fact, that is an extremely short time by any | serious standards. In my experience most games on average take only 60 moves (perhaps because of the large strength difference of the players). As early moves are more important for the game result than late moves (even the best moves late in the game do not help you if your position is already lost), most engines use 2.5% of the remaining time for their next move (on average, depending on how the iterations end compared to the target time). That would be nearly 54 sec/move at 36 min/game in the decisive phase of the game. That is more than you thought, but admittedly still fast. Note, however, that I also played 60-min games in the General Championship (without time odds), and that Joker80 confirms its lead over the competitors it manifested at faster time controls. But I don't see the point: Joker80's strength increases with time as expected, in the range from 0.4 sec to 36 sec per move, in a regular and theoretically expected way. This is over the entire range where I tested the dependence of the scoring percentage of various material imbalances, which extended to only 15 sec/move, and found it to be independent of TC. So your 'explanation' for the latter phenomenon is just nonsense. The effect you mention is observed NOT to occur, and thus cannot explain anything that was observed to occur. Now if you want to conjecture that this will all miraculously become very different at longer TC, you are welcome to test it and show us convincing results. I am not going to waste my computer time on such a wild and expensive goose chase.
Because from the way I know the engines work, I know that they are 'scalable': their performance at 10 ply results from one ply being put in front of 9-ply search trees. And that extra ply will always help. If they have good 9-ply trees, they will have even better 10-ply trees. But you don't have to take my word for it. You have the engine, and if you don't want to believe that at 1 hour per move you will get the same win probability as at 1 sec/move, or that at 1 hour per move it won't beat 10 min/move, just play the games, and you will see for yourself. It would even be appreciated if you publish the games here or on your website. But, needless to say, one or two games won't convince anyone of anything. | 'since I am not a computer chess programmer, I cannot possibly | know what I am talking about when I dare criticize an important | working of your Joker80 program' Well, you certainly make it appear that way. As, despite the elaborate explanation I gave of why programs derive extra strength from this technique, you still draw a conclusion that in practice was already shown to be 100% wrong earlier. And if you think you will run into the problem you imagine at enormously longer TC, well, very simple: don't use Joker80, but use some other engine. You are on your own there, as I am not specifically interested in extremely long TC. There is always a risk in using equipment outside the range of conditions for which it was designed and tested, and that risk is entirely yours. So better tread carefully, and make sure you rule out the perceived dangers by careful testing. | You must decide upon and define the primary function of your | Joker80 program. I do not see the dilemma you sketch. The purpose is to play ON AVERAGE the best possible move. If you do that, you have the best chance to win the game. If I can achieve that through a non-deterministic algorithm better than through a deterministic one, I go for the non-deterministic method.
That it also diversifies play, and makes me less sensitive to prepared openings from the opponent, is a win/win situation. Not a compromise. As I explained, it is very easy to switch this feature off. But you should be prepared for significant loss of strength if you do that.
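The time-management rule of thumb mentioned above (spending about 2.5% of the remaining clock on each move) is easy to verify for the 36-minute example:

```python
def think_time(remaining_seconds, fraction=0.025):
    """Target thinking time for the next move, as a fixed fraction
    of the remaining clock (2.5% is the rule of thumb quoted above)."""
    return fraction * remaining_seconds

print(think_time(36 * 60))   # 54.0 seconds on the first move of a 36-min game
```

Because the budget is a fraction of what remains, the per-move time decays geometrically over the game, front-loading effort onto the early, more important moves.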
| I just cannot understand how any rational, intelligent man could | believe that introducing chaos (i.e., randomness) is beneficial | (instead of detrimental) to achieving a goal defined in terms of | filtering-out disorder to pinpoint order. It would be very educational then to get yourself acquainted with the current state of the art of Go programming, where Monte-Carlo techniques are the most successful paradigm to date... | When you reduce the power of your algorithm in any way to | filter-out inferior moves, you thereby reduce the average | quality of the moves chosen and consequently, you reduce | the playing strength of your program- esp. at long time controls. Exactly. This is why I _enhance_ the power of my algorithm to filter out inferior moves. As the inferior moves have a smaller probability to draw a large positive random bonus than the better moves. They thus have a lower probability to be chosen, which enhances the average quality of the moves, and thus playing strength. At any time control. It is a pity this suppression of inferior moves is only probabilistic, and some inferior moves by sheer luck can still penetrate the filter. But I know of no deterministic way to achieve the same thing. So something is better than nothing, and I settle for the inferior moves only getting a lower chance to pass. Even if it is not a zero chance, it is still better than letting them pass unimpeded. | In any event, the addition of the completely-unnecessary module of | code used to create the randomization effect within Joker80 that | you desire irrefutably makes your program larger, more complicated | and slower. Can that be a good thing? Everything you put into a Chess engine makes it larger and slower. Yet, taking almost everything out only leaves you with a weak engine like micro-Max 1.6. The point is that putting code in also can make the engine smarter, improve its strategic understanding, reduce its branching ratio, etc.
So whether it is a good thing or not does not depend on whether it makes the engine larger, more complicated, or slower. It depends on whether the engine still fits in the available memory, and from there produces better moves in the same time. Which larger, more complicated and slower engines often do. As always, testing is the bottom line. Actually the 'module of code' consists of only 6 instructions, as I derive the pseudo-random number from the hashKey. But the point you are missing is this: I have theoretical understanding of how Chess engines work, and am therefore able to extrapolate their behavior with high confidence from what I observe under other conditions (i.e. at fast TC). Just like I don't have to travel to the Moon and back to know its distance from the Earth, because I understand geometry and triangulation. So if including a certain evaluation term gives me more accurate scores (and thus more reliable selection of the best move) from 8-ply search trees, I know that this can only give better moves from 18-ply search trees. As the latter is nothing but millions of 8-ply search trees grafted on the tips of a mathematically exact 10-ply minimax propagation of the score from the 8-ply trees towards the root. Anyway, it is not of any interest to me to throw months of valuable CPU time at answering questions I already know the answer to.
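A hypothetical sketch of deriving the pseudo-random evaluation bonus from the position's hash key, as mentioned above (the bit count and masking are assumptions for illustration, not Joker80's actual 6 instructions):

```python
def hash_noise(hash_key, bits=3):
    """Per-position pseudo-random bonus of 0..2**bits-1 centipawns,
    taken from the low bits of an already well-mixed hash key."""
    return hash_key & ((1 << bits) - 1)
```

Because the bonus is a pure function of the position's hash, the same position always gets the same bonus within one search, keeping scores consistent across transpositions, while re-seeding the underlying Zobrist keys per game would diversify play between games.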
Derek: | The moral of the story is that randomization of move selection | reduces the growth in playing strength that normally occurs with | time and plies completed. This is not how it works. For one, you assume that at long TC there would be fewer moves to choose from, and they would be farther apart in score. This is not the case. The average distribution of move scores in the root depends on the position, not on search depth. And in cases where the scores of the best and second-best move are far apart, the random component of the score propagating from the end-leaves to the root is limited to some maximum value, and thus could never cause the second-best move to be preferred over the best move. The mechanism can only have any effect on moves that would score nearly equal (within the range of the maximum addition) in absence of the randomization. For moves that are close enough in score to have an effect on, the random contribution in the end-leaves will be filtered by minimax while trickling down to the root in such a way that it is no longer a homogeneously distributed random contribution to the root score, but on average suppresses scores of moves leading to sub-trees where the opponent had a lot of playable options, and we only a few, while on average increasing scores where we have many options, and the opponent only a few. And the latter are exactly the moves that, in the long term, will lead you to positions of the highest score.
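The bounded-noise argument above can be demonstrated directly: a random addition limited to 10cP can never promote a move that is 100cP worse, but does diversify between near-equal moves. A minimal sketch:

```python
import random

rng = random.Random(1)

def choose(scores, noise=10.0):
    """Index of the move picked after each score gets a bounded random bonus."""
    return max(range(len(scores)),
               key=lambda i: scores[i] + rng.uniform(0.0, noise))

picks_far  = {choose([100.0, 0.0]) for _ in range(1000)}   # gap >> noise
picks_near = {choose([52.0, 51.0]) for _ in range(1000)}   # gap < noise
print(picks_far)    # {0}: the clearly better move always wins
print(picks_near)   # {0, 1}: near-equal moves both get chosen sometimes
```

So the randomization only ever arbitrates between moves the evaluation cannot reliably distinguish anyway, at any search depth.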
No engine I know of prunes in the root, in any iteration. They might reduce the depth of poor moves compared to that of the best move by one ply, but they will be searched in every iteration except a very early one (where they were classified as poor) to a larger depth than they were ever searched before. So at any time their score can recover, and if it does, they are re-searched within the same iteration at the un-reduced depth. This is absolutely standard, and also how Joker80 works. Selective search, in engines that do it, is applied only very deep in the tree. Never close to the root.
It is a bit misleading to list Capablanca Random Chess (and its Modern variant) here with a fixed array. It would have been more logical to depict an empty board, with the pieces next to it... For completeness, it should at least have mentioned what the restrictions for setting up the pieces are. Note that most engines able to play Scharnagl's CRC are also capable of playing opening setups he explicitly excludes from being CRC, i.e. with undefended Pawns, with Bishops next to each other, or with Q and A on like colors. They in general consider this all the same variant, 'Capablanca Random Chess', as opening arrays in the program logic are not part of the variant definition, but are simply set by loading a FEN. CRC in some programs is considered a different variant from Capablanca, due to the different castling rules (like FRC is considered a distinct variant from normal FIDE Chess).
George Duke: | However, the reality is if one is playing many CVs, precisely | Number One, not any of the other 3, is far and away the most valuable | and reliable tool, effectively building on experience. Time is also | factor, and unless Player can adjust quickly, without extensive | playtesting, and make ballpark estimates of values, all is lost on | new enterprise. We recommend just this Method One, increasing | facility at it, for serious CV play, and in turn the designer | needs to try to keep the game somewhat out of reach for Computer. Well, I guess that it depends on what your standards are. If you are satisfied with values that are sometimes off by 2 full Pawns, (as the case of the Archbishop demonstrates to be possible), I guess method #1 will do fine for you. But, as 2 Pawns is almost a totally winning advantage, my standards are a bit higher than that. If I build an engine for a CV, I don't want it to strive for trades that are immediately losing.
Derek: | Could you please give me example lines within the 'winboard.ini' | file that would successfully do so? I need to make sure every | character is correct. Sorry for the late response; I was on holiday for the past two weeks. The best way to do it is probably to make the option dependent on the engine selection. That means you have to write it behind the engine name in the list of pre-selectable engines like: /firstChessProgramNames={... 'C:/engines/joker/joker80.exe 23' /firstInitString='new\n' ... } And something similar for the second engine, using /secondInitString. The path name of the joker80 executable would of course have to be where you installed it on your computer; the argument '23' sets the hash-table size. You could add other arguments, e.g. for setting the piece values, there as well. Note the executable name and all engine arguments are enclosed by the first set of quotes (which are double quotes, but these for some reason refuse to print in this forum), and everything after this first syntactical unit on the line is interpreted as WinBoard arguments that should be used with this engine when it gets selected. Note that string arguments are C-style strings, enclosed in double quotes, and making use of escape sequences like '\n' for newline. The default value for the init strings is 'new\nrandom\n'.
George Duke: | Has initial array positioning already entered discussion for | value determinations? No, it hasn't, and I don't think it should, as this discussion is about Piece Values, and not about positional play. Piece values are by definition averages over all positions, and thus independent of the placement of pieces on the board. Note furthermore that the heuristic of evaluation is only useful for strategic characteristics of a position, i.e. characteristics that tend to be persistent, rather than volatile. Piece placement can be such a trait, but not always. In particular, in the opening phase, pieces are not locked in the places where they start, but can find plenty of better places to migrate to, as the center of the board is still complete no-man's land. Therefore, in the opening phase, the concept of 'tempo' becomes important: if you waste too much time, the opponent gets the chance to conquer space, and prevent the pieces that were badly positioned in the array from developing properly. I did some asymmetric playtesting for positional values in normal Chess, swapping Knights and Bishops for one side, or Knights and Rooks. I was not able to detect any systematic advantage the engines might have been deriving from this. In my piece-value testing I eliminate positional influences by playing from positions that are as symmetric as possible given the material imbalance. And the effect of starting the pieces involved in the imbalance in different places is averaged out by playing from shuffled arrays, so that each piece is tried in many different locations.
Well, never mind. The symmetrical playtesting would not have given any conclusive results with anything less than 2000 games anyway. The asymmetrical playtesting sounds more interesting. I am not completely sure what Smirf bug you are talking about, but in the Battle of the Goths Championship it happened that Smirf played a totally random move when it could give mate in 3 (IIRC) according to both programs (Fairy-Max was the lucky opponent). This move blundered away the Queen with which Smirf was supposed to mate, after which Fairy-Max had no trouble winning with an Archbishop against some five Pawns. This seems to happen when Smirf has seen the mate, and stored the tree leading to it completely in its hash table. It is then no longer searching, and it reports score and depth zero, playing the stored moves (at least, that was the intention). I have never seen any such behavior when Smirf was reporting a non-zero search depth, and in particular, the last non-zero-depth score before such an occurrence (a mate score) seemed to be correct. So I don't think there is much chance of an error when you believe the mate announcement and call the game. Of course you could also use Joker80 or TJchess10x8, which do not suffer from such problems.
| However, TJChess cannot handle my favorite CRC opening setup, | Embassy Chess, without issuing false 'illegal move' warnings and | stopping the game. Remarkable. I played this opening setup too, in Battle of the Goths, and never noticed any problems with TJchess. It might have been another version, though. If you have somehow saved the game, be sure to send it to Tony, so he can fix the problem.
OK, I see the problem now. I forgot that the Embassy array is a mirrored one, with the King starting on e1, rather than f1. And that to avoid any problems with it in Battle of the Goths, I did not really play Embassy, but the fully equivalent mirrored Embassy. And with that one, none of the engines had problems, of course. Actually it seems that it is not TJchess that is in error here: e1b1 does seem to be a legal castling in Embassy. It is WinBoard_F which unjustly rejects the move, most likely because the FEN reader ignores specified castling rights for which it does not find a King on f1 and a Rook in the indicated corner. The fact that you don't have this problem with Joker80 is because Joker80 is buggy. (Well, not really; it is merely outside its specs. Joker80 considers all castlings with a non-corner Rook and King not in the f-file as CRC castlings, which are only allowed in variant caparandom, but not in variants capablanca or *UNSPEAKABLE*. And Joker80 does not support caparandom yet.) So the fact that you don't see any problems with Joker80 is because it will never castle when you feed it the Embassy setup, so that WinBoard doesn't get the chance to reject the castling as illegal. And if the opponent castles, WinBoard would reject it as illegal, and not pass it on to Joker80. I guess the fundamental fix will have to wait until I implement variant caparandom in WinBoard; I think that both WinBoard and Joker80 are correct in identifying the Embassy opening position as not belonging to Capablanca Chess, but needing the CRC extension of castling. (Even if it is only a limited extension, as the Rooks are still in the corner.) And after I fix it in WinBoard, I still would have to equip Joker80 with CRC capability before you could use it to play the Embassy setup. It is not very high on my list of priorities, though, as I see little reason to play Embassy rather than mirrored Embassy.
I am still contemplating how to generalize the castling in Joker80. There are two issues there: how to communicate the move from and to the GUI, and how to indicate the existence of the rights. Currently WinBoard protocol has two mechanisms to set up a position: by loading the engine with a FEN, or (obsolete) through an edit command to enter a list of piece+square combinations. The latter mode does not support transmission of castling rights at all, and is only a legacy for backward compatibility with old engines. So for loading a position, we only have to provide a mechanism for indicating castling rights in a FEN. The FRC-type notation only indicates the position of the Rook. The King does not need to be indicated in games where there is only one King, and the positioning of the Rook w.r.t. the King implies where both will end up. This means we would have to devise some other notation for cases where the King ends elsewhere. I am not sure if it would make sense to generalize so much as to allow castlings where the Rook does not end up next to, and on the other side of, the King. There is of course no limit to the craziness of moves that could be called a castling, but one would have to put a limit somewhere, to not fall victim to the 'maximum-flexibility, minimum-usefulness' principle. I would probably implement it like this: in the castling-rights field of the FEN, the letter indicating the file of the Rook that can castle (which does not necessarily have to be an orthodox Rook, as the FEN makes it obvious what piece is standing there) can be followed by a digit, indicating the number of squares the King ends up away from the corner. The final position of the Rook would be implied by this. Example: normal King-side castling rights could be indicated by H1. The 1 would be the default (on an 8x8 board), and could be omitted for upward compatibility with Shredder-FEN. In Capablanca Chess the opening would have castling rights A2J1a2j1, equivalent to AJaj (or KQkq).
Symmetric castling rights like in Janus Chess would be indicated as A1J1a1j1, or A1Ja1j when deleting the redundant defaults. Multiple castling rights to the same side could exist next to each other: A2A1J2J1a2a1j2j1 would allow short as well as long castling in both directions. For transmitting the castling moves, one could use King captures own Rook. In games where the same Rook could be used for castlings with multiple King destinations, one could give the King step to its final destination instead. If this could also be a normal King move, one could append an r as 5th character to identify it as a castling, using the syntax that would otherwise be used for promotions. In PGN one could use similar strategies to indicate non-orthodox castlings, and use the suffix =R on a King move to specify castling. I think this covers most cases encountered in practice. Problems occur only if there would be multiple castlings with the same Rook, and at the same time castlings with a Rook on the left would have the same King destination as those with a Rook on the right, because the move notation cannot then indicate which Rook to use and at the same time specify where the King should go. But this seems too outlandish to worry about. To cover cases where K and R do not end up next to each other, we could put a second digit in the FEN castling-rights specifier for the final position of the Rook w.r.t. the corner. (I.e. normal King-side castling = h12.) This obviously could lead to problems on very wide boards, which require multiple digits to specify the distance to the corner. So perhaps it is better to separate King and Rook destination by a period (h1.2). Indicating the move would be a problem, as two destinations might have to be specified to unambiguously identify the move (e.g. if all castlings are allowed where the King steps any number of squares >=2 towards a Rook, and then the Rook can go to any square the King passed over). One could just specify King and Rook final squares (i.e.
O-O = g1f1), but in FRC there is no guarantee that this cannot be a normal move. In that case the 'r' could again be used as 5th character, to indicate castling. In PGN we could reserve a character used instead of the piece indicator for castlings, say 'O'. Conclusion: it is difficult to design a notation that would be general and universal; different games seem to need different ways to specify the moves and rights.
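To make the proposed rights encoding concrete, here is a small hypothetical sketch (Python for illustration, not actual WinBoard code) of a reader for such a castling-rights field. It assumes a single digit for the King's distance from the corner, an optional period-separated distance for the Rook, and the defaults described above (King one square from the corner, Rook directly next to the King):

```python
import re

# One token = Rook file letter (upper case = White, lower case = Black),
# an optional single digit giving the King's final distance from the corner
# (default 1), and an optional '.'-separated final distance for the Rook
# (default: next to the King, i.e. one square further from the corner).
TOKEN = re.compile(r'([A-Ja-j])(\d)?(?:\.(\d+))?')

def parse_rights(field):
    rights = []
    for rook_file, king_dist, rook_dist in TOKEN.findall(field):
        king = int(king_dist) if king_dist else 1
        rook = int(rook_dist) if rook_dist else king + 1
        side = 'white' if rook_file.isupper() else 'black'
        rights.append((side, rook_file.lower(), king, rook))
    return rights

# 'H' alone then reads as normal White K-side castling: King one square from
# the h-corner (g1), Rook next to it on the corner side (f1, distance 2).
```

With this reading, 'A2J1a2j1' yields the Capablanca rights from the text above, and 'h1.2' spells out ordinary Black K-side castling with the Rook destination made explicit.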
Well, one has to think ahead a little bit to keep the road to future extensions open, and not paint oneself into a corner. This is why I tackle a fairly large number of cases at once. I don't see the uniqueness of the FEN strings as a serious problem; if the logic behind the various systems would allow a certain castling to be described in multiple ways, one can supply an additional rule to specify which method should be used preferentially. For example, if K or H could both be used to unambiguously specify King-side castling, one should use K. In the FEN reader I would not even pay attention to that, and have it understand both, as this is usually easier. An important issue is how much effort one should put into maintaining a unified approach, in which both game state and played variant are unambiguously specified by the FEN. One might wonder if it is sensible to require, say, that a position from Janus Chess and a position from Capablanca Chess should be considered as different positions from the same variant, 'fullchess'. This puts a lot of extra burden on the FEN: for indicating game state, the castling rights have to indicate only which pieces moved. Wanting the FEN to specify the castling method, or other aspects of the rules (e.g. if Pawns can promote to Chancellors, or not), might just be asking for trouble. So perhaps I was overdoing it. It might be more useful to consider variants like Janus or Grotesque as distinct from Capablanca. KQkq could then be used to indicate castling rights in all three cases. Games with more than 2 Rooks could use the Shredder-FEN system without any problem, as long as there is only one King (so that all rights disappear once this King moves). Only in games with multiple Kings AND multiple Rooks would there be a problem. This only leaves move notation, in particular in variants where a castling to a particular side can be performed in more than one way, like in Grotesque.
A very general way to solve this in PGN would be to provide a mechanism to specify moves that displace more than one piece, by joining the moves with an &. So an alternative to write h-side castling in Grotesque could be Ke1-i1&Rj1-h1 (or in short, Ki1&Rh1). In WinBoard protocol, the moves between engine and GUI are not transmitted in SAN, but simply as FROM and TO square appended to each other, with an optional 5th character to indicate promotion piece (e.g. e7e8q). Perhaps the best system there would be to encode variable castlings by using k or q as the 'promotion' character, to indicate if the K-side or Q-side Rook is to be used, and make the squares indicate the to-square of King and Rook, respectively. These notations would always be recognizable as not indicating promotions, as both the mentioned squares would be on the same rank.
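As an illustration of the suggested engine-GUI encoding (a sketch under the assumptions above, not existing WinBoard-protocol code), a decoder could separate the three cases by looking at the optional 5th character and the ranks of the two squares:

```python
def decode_move(m):
    # 'e7e8q' -> ordinary promotion to Queen
    # 'g1f1k' -> castling with the K-side Rook: King to g1, Rook to f1
    #            (recognizable as a castling because both squares share a rank,
    #             which can never happen in a promotion)
    # 'e2e4'  -> ordinary move
    frm, to, suffix = m[0:2], m[2:4], m[4:5]
    if suffix in ('k', 'q') and frm[1] == to[1]:
        return {'type': 'castle', 'king_to': frm, 'rook_to': to,
                'rook_side': suffix}
    if suffix:
        return {'type': 'promotion', 'from': frm, 'to': to, 'piece': suffix}
    return {'type': 'normal', 'from': frm, 'to': to}
```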
'If the effort isn't too big' is a big if. Normal chess, Capablanca, FRC and CRC are similar enough not to cause too much trouble, although I consider it already a nasty trait that some of the rules have to be implied by the board size, such as to which pieces a Pawn may promote. If a board width of 10 is taken to imply that Chancellor and Archbishop are allowed, I would consider the problems with Janus Chess or Chancellor Chess already pretty bad. To unify those with Capablanca/CRC would require different letters for their Pawns. In Janus Chess you would have to indicate the deviating castling mode amongst the rights. In the O-i-h system, would you always be able to deduce whether the K-side or Q-side Rook is to be used from the ordering of the King and Rook destinations? I guess we could indeed consider it a defining property of a castling that it swaps the order of King and Rook. I am not aware of any exceptions to this; even in FRC/CRC the King is required to be between the two Rooks. So I guess your system is acceptable for use in the PGN, with the additional preference rule that if there is only one castling possible to the given side, it would be written as O-O or O-O-O. The way the move is transmitted between engine and GUI in WB protocol is a matter specific to the WinBoard GUI. And WinBoard does generate the list of allowed moves itself; there is no way in WB protocol to request it from the engine. As this type of castling with multiple King and Rook destinations is about as crazy as they get, anticipating this format would probably be enough to cover everything. Even normal castling requires the GUI to recognize castlings, and know which Rook to move and where. (This caused problems when I had Fairy-Max play Cylinder Chess, as a King crossing the side edge was considered a castling, and led to the displacement of a second piece on the display board!)
In fact, with the assumption that the relative orientation of King and Rook destination squares implies which Rook has to be used (and even if there are several Rooks on that side, only the one nearest to the King could be involved in castling), there is no need to convey any information in the 5th character other than that it is a castling. So an O here would be quite convenient, as promotion pieces have to be lower case in WB protocol. For a really dumb interface (like my battle-of-the-Goth javascript viewer) it is necessary to fully specify from- and to-square of each piece that is moved separately. So there I transmit O-O as e1g1h1f1 and e4xf5 e.p. as e4f5e4f4.
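For such a fully spelled-out format, decoding is trivial. A sketch of how the viewer's move strings could be split back into per-piece moves (a hypothetical helper, mirroring the examples above):

```python
def split_pairs(move):
    # Cut the string into 2-character squares, then pair them up as
    # (from, to) per displaced piece:
    #   'e1g1h1f1' -> [('e1', 'g1'), ('h1', 'f1')]  (O-O: King move, Rook move)
    squares = [move[i:i + 2] for i in range(0, len(move), 2)]
    return list(zip(squares[0::2], squares[1::2]))
```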