Comments/Ratings for a Single Item
| I just cannot understand how any rational, intelligent man could
| believe that introducing chaos (i.e., randomness) is beneficial
| (instead of detrimental) to achieving a goal defined in terms of
| filtering-out disorder to pinpoint order.
It would be very educational then to get yourself acquainted with the current state of the art of Go programming, where Monte-Carlo techniques are the most successful paradigm to date...
| When you reduce the power of your algorithm in any way to
| filter-out inferior moves, you thereby reduce the average
| quality of the moves chosen and consequently, you reduce
| the playing strength of your program- esp. at long time controls.
Exactly. This is why I _enhance_ the power of my algorithm to filter out inferior moves: the inferior moves have a smaller probability of drawing a large positive random bonus than the better moves. They thus have a lower probability of being chosen, which enhances the average quality of the moves, and thus playing strength. At any time control. It is a pity this suppression of inferior moves is only probabilistic, and some inferior moves can still penetrate the filter by sheer luck. But I know of no deterministic way to achieve the same thing. So something is better than nothing, and I settle for the inferior moves only getting a lower chance to pass. Even if it is not a zero chance, it is still better than letting them pass unimpeded.
| In any event, the addition of the completely-unnecessary module of
| code used to create the randomization effect within Joker80 that
| you desire irrefutably makes your program larger, more complicated
| and slower. Can that be a good thing?
Everything you put into a Chess engine makes it larger and slower. Yet taking almost everything out only leaves you with a weak engine like micro-Max 1.6. The point is that putting code in can also make the engine smarter, improve its strategic understanding, reduce its branching ratio, etc.
So whether it is a good thing does not depend on whether it makes the engine larger, more complicated, or slower. It depends on whether the engine still fits in the available memory and, from there, produces better moves in the same time. Which larger, more complicated and slower engines often do. As always, testing is the bottom line.
Actually, the 'module of code' consists of only 6 instructions, as I derive the pseudo-random number from the hashKey.
But the point you are missing is this: I have a theoretical understanding of how Chess engines work, and am therefore able to extrapolate their behavior with high confidence from what I observe under other conditions (i.e. at fast TC). Just like I don't have to travel to the Moon and back to know its distance from the Earth, because I understand geometry and triangulation. So if including a certain evaluation term gives me more accurate scores (and thus more reliable selection of the best move) from 8-ply search trees, I know that this can only give better moves from 18-ply search trees. The latter is nothing but millions of 8-ply search trees grafted on the tips of a mathematically exact 10-ply minimax propagation of the score from the 8-ply trees towards the root. Anyway, it is not of any interest to me to throw away months of valuable CPU time to answer questions I already know the answer to.
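For readers unfamiliar with the idea, here is a minimal sketch of deriving a small, bounded pseudo-random evaluation bonus from a position's hash key. It assumes a 64-bit Zobrist-style key; the function name, the bit-mixing constant and the bonus range are hypothetical choices for illustration, not Joker80's actual code.

```python
MASK64 = (1 << 64) - 1  # keep arithmetic within 64 bits, like a C uint64_t


def random_bonus(hash_key: int, max_bonus: int = 8) -> int:
    """Return a deterministic pseudo-random bonus in [-max_bonus, +max_bonus].

    The same position (same hash key) always gets the same bonus, so the
    search stays consistent when a position is revisited.
    """
    # Scramble the key with a multiply and a shift-xor (an assumed mixer).
    mixed = (hash_key * 0x9E3779B97F4A7C15) & MASK64
    mixed ^= mixed >> 32
    # Fold into the symmetric range [-max_bonus, +max_bonus].
    return mixed % (2 * max_bonus + 1) - max_bonus


if __name__ == "__main__":
    for key in (0x1234ABCD, 0x1234ABCE, 0xFFFFFFFFFFFFFFFF):
        print(hex(key), random_bonus(key))
```

Because the bonus is a pure function of the key, no random-number-generator state has to be stored, which is presumably why so few instructions suffice.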
'It would be very educational then to get yourself acquainted with the current state of the art of Go programming ...' Go is a connection game that is not related to Chess or its variants. The only thing Go has in common with Chess is that it is played upon a board using pieces. You did not directly address my comment.
Rest assured, I intend to drop this futile topic of conversation soon and leave you alone. The following is my impression of how the limited randomization of move selection that you have described as being at work within Joker80 must be harmful to the quality of moves made (on average) at long time controls. Since you have experience and knowledge as the developer of Joker80, I will defer to you the prerogative to correct errors in my inferred, general understanding of its workings.
_______________________________________________________
short time control 1x
At an example time control of 10 seconds per move (average), Joker80 cuts thru 8 plies before it runs out of time and must produce a move. At the moment the time expires, it has selected 12 high-scoring moves as candidates out of a much larger number of legal moves available. Generally, all of them score closely together with a few of them even tied for the same score. So, when Joker80 randomly chooses one move out of this select list, it has probably not chosen a move (on average) that is beneath the quality of the best move it could have found (within those severe time constraints) by anything except a minor amount. In other words, the damage to playing strength via randomization of move selection is minimized under minimal time controls.
___________________________
long time control 360x
At an example time control of 60 minutes per move (average), Joker80 cuts thru 14 plies (due to its sophisticated advance pruning techniques) before it runs out of time and must produce a move. At the moment the time expires, it has selected only 4 high-scoring moves as candidates out of a much larger number of legal moves available. Generally, all of them score far apart with a probable best move scored significantly higher than the probable second best move. So, when Joker80 randomly chooses one move out of this select list, the chances are 3/4 that it has ignored its probable best move. Furthermore, it may not have chosen the probable second best move, either. It just as likely could have chosen the probable third or fourth best move, instead. Ultimately, it has probably chosen a move (on average) that is beneath the quality of the best move it may have successfully found by a moderate-major amount. In other words, the damage to playing strength via randomization of move selection is maximized under maximal time controls.
_______________________________________
The moral of the story is that randomization of move selection reduces the growth in playing strength that normally occurs with time and plies completed.
I'm not very familiar with H.G.'s randomization technique, so I really have no idea how well it works. It sounds like he adds small random values to leaf node evaluations, which is of course different than selecting a random 'good' move from the root of the search. Note that it is definitely true that randomness can be helpful for a chess engine, even though it might seem counter-intuitive. For example, basically all strong chess engines (as far as I know) use random (pseudo-random) Zobrist keys for hashing. The random keys may be generated at run-time, or pre-generated, but they are random either way. Using different random keys will cause the engine to give slightly different results without necessarily changing the engine's overall strength. Obviously, if used incorrectly, randomness could severely hurt an engine's strength as well. For example, if an engine just plays random moves. :)
Derek:
| The moral of the story is that randomization of move selection
| reduces the growth in playing strength that normally occurs with
| time and plies completed.
This is not how it works. For one, you assume that at long TC there would be fewer moves to choose from, and that they would be farther apart in score. This is not the case. The average distribution of move scores in the root depends on the position, not on search depth. And in cases where the scores of the best and second-best move are far apart, the random component of the score propagating from the end-leaves to the root is limited to some maximum value, and thus could never cause the second-best move to be preferred over the best move. The mechanism can only have any effect on moves that would score nearly equal (within the range of the maximum addition) in absence of the randomization.
For moves that are close enough in score to be affected, the random contribution in the end-leaves will be filtered by minimax while trickling down to the root in such a way that it is no longer a homogeneously distributed random contribution to the root score, but on average suppresses scores of moves leading to sub-trees where the opponent had a lot of playable options, and we only few, while on average increasing scores where we have many options, and the opponent only few. And the latter are exactly the moves that, in the long term, will lead you to positions of the highest score.
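The two claims above can be illustrated with a toy experiment. This is only a sketch under strong simplifying assumptions: a 2-ply tree (we move, the opponent replies), every leaf below a root move sharing the same 'true' score, and uniform noise bounded by max_noise. The function names and numbers are made up for illustration and do not come from any engine.

```python
import random


def root_score(true_score, num_replies, max_noise, rng):
    """Score of one root move after a 2-ply search.

    Each opponent reply leads to a leaf scored as true_score plus bounded
    random noise; the opponent picks the reply that is worst for us (min).
    """
    return min(true_score + rng.uniform(-max_noise, max_noise)
               for _ in range(num_replies))


def average_root_score(true_score, num_replies, max_noise,
                       trials=10000, seed=1):
    """Average the noisy root score over many independent trials."""
    rng = random.Random(seed)
    total = sum(root_score(true_score, num_replies, max_noise, rng)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Same true score, but the opponent has 2 versus 20 playable replies.
    print("opponent has 2 replies: ", average_root_score(0, 2, 10))
    print("opponent has 20 replies:", average_root_score(0, 20, 10))
```

In this toy model the move that leaves the opponent many replies gets a lower average score, because the minimum over many noisy leaves tends to be lower (in theory about -3.3 versus -9.0 for the values above). The root score also never leaves the interval from true_score - max_noise to true_score + max_noise, so a move whose true score is more than 2 * max_noise better can never be overtaken.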
I have read that most computer chess programmers use the brute force method initially, when the plies can be cut thru quickly, and then switch to advanced pruning techniques to focus the search from then on. This led to my mis-interpretation that Joker80 would have more moves under consideration as the best at short time controls than at long time controls. Some moves that score highly-positive after only a few-several plies will score lowly-positive, neutral or negative after more plies. Thus, I do not see how the number of moves under consideration as the best could avoid being reduced slightly with plies completed. As a practical concern, there is rarely any benefit in accepting the CPU load associated with, for example, carrying a low-score positive move returned after 13-ply completion thru 14-ply completion when other high-score positive moves exist in sufficient number.
No engine I know of prunes in the root, in any iteration. They might reduce the depth of poor moves compared to that of the best move by one ply, but these moves will be searched in every iteration except a very early one (where they were classified as poor) to a larger depth than they were ever searched before. So at any time their score can recover, and if it does, they are re-searched within the same iteration at the un-reduced depth. This is absolutely standard, and also how Joker80 works. Selective search, in engines that do it, is applied only very deep in the tree. Never close to the root.
''Educated guessing based on known 8x8 piece values and assumptions on synergy values of compound pieces'' -- [immediately] Muller rejects it out of hand from his list of four 3.May.2008. ''We can safely dismiss method (1) as unreliable...'' Then he touts the more scientific, roughly: 2) board-averaged piece mobilities 3) best fit from computer-computer games deliberately imbalanced 4) Playtesting. However, the reality is if one is playing many CVs, precisely Number One, not any of the other 3, is far and away the most valuable and reliable tool, effectively building on experience. Time is also factor, and unless Player can adjust quickly, without extensive playtesting, and make ballpark estimates of values, all is lost on new enterprise. We recommend just this Method One, increasing facility at it, for serious CV play, and in turn the designer needs to try to keep the game somewhat out of reach for Computer.
From what I have seen in regard to both variants and programmers, it seems logical to conclude that any game a human mind can play, a program can be written for. The program may be flawed, but the bugs can be worked out.
In my opinion, designers need not worry about computers. If you make a great game, likely someone will get a computer to play it. That is not to say all great games end up having associated programs... but they could.
George Duke:
| However, the reality is if one is playing many CVs, precisely
| Number One, not any of the other 3, is far and away the most valuable
| and reliable tool, effectively building on experience. Time is also
| factor, and unless Player can adjust quickly, without extensive
| playtesting, and make ballpark estimates of values, all is lost on
| new enterprise. We recommend just this Method One, increasing
| facility at it, for serious CV play, and in turn the designer
| needs to try to keep the game somewhat out of reach for Computer.
Well, I guess that it depends on what your standards are. If you are satisfied with values that are sometimes off by 2 full Pawns (as the case of the Archbishop demonstrates to be possible), I guess method #1 will do fine for you. But, as 2 Pawns is almost a totally winning advantage, my standards are a bit higher than that. If I build an engine for a CV, I don't want it to strive for trades that are immediately losing.
Upon reflection, I have no conceivable reason to be distrustful of using Joker80 IF I shut-off its limited randomization of move selection which Winboard F activates by default. Could you please give me example lines within the 'winboard.ini' file that would successfully do so? I need to make sure every character is correct.
Let's open the discussion to designers not actively programming now. There is a lot of two-tracking in CVPage. One example of double track is that most designers see their work as art and become prolificists (Betza, Gilman), the more ''paintings'' in their portfolio the better; whereas few others want to replace standard FIDE form logically (Duke, Trice, FischerRandom, Duniho's Eurasian, Joyce's Great Shatranj). The two camps talk at cross-purposes. Two other (different) opposite tracks may be seen in this thread, namely, between player and programmer. Straightforward heuristic for player (usually designer too hereabouts), to make ongoing alterable piece-value estimates, certainly refining if possible to within 0.1 of a Pawn, there being so many hundreds of CVs to compute, of course will not do in itself for programmer. It is interesting, that's all, that the player's recipe is rejected immediately by the programmer. Player would gravitate to '1)' and '4)' rather than programmer-popular '2)' and '3)'. Another topic to relate here is proven fallacy after 400 years of emphasized Centaur (BN) and Champion (RN) anyway, discussed much in 2007, to be resurrected in follow-up.
// In response to Gifford's: Computers will never write rhymed lines this century where every syllable matches in rhyme like: ''The avatar Horus' all-seeing Eye/ We have a star-chorus rallying cry.'' Granted most would not like style of writing, but still Computer cannot do it, rhyme every word with meaning. Similarly, we need games Computer cannot play well, or be expected to ever, using hidden information like Kriegspiel if that's what it takes, or Rules changing within score, or something else. Surely the main reason for vanishing interest in Mad Queen is Computer dominance in all aspects.
With Centaur and Champion(RN) the array must affect values on 8x10 especially. Detraction of 0.1 or more for both cornered, one would expect. In Falcon Chess, of the 453,600 initial arrays, cornered positions for Falcon lower value relatively. Cheops' 'FRNBQ...' or Pyramids' 'FBRNQ...' each take away 0.1 or 0.2 of more general 6.5. Templar's 'RBFNQ...' and Osiris' 'RNBQFF...' are harder to distinguish from standard 'RNBFQ...' and 'RNFBQ...' Has initial array positioning already entered discussion for value determinations?
I bet if you offered a $20,000 reward we'd see many programs coming to meet the poetic challenge within a matter of months. You can read about computer generated writing here:
http://www.evolutionzone.com/kulturezone/c-g.writing/index_body.html
Anyway, I believe that computers are up to such a poetic task... it just takes a motivated programmer.
Back to CVs: Chess is a great game. And just because computers can play it far better than most, are we to discard it? I don't think so; not as long as humans play humans and enjoy the game while doing so. The same goes for other variants.
As for the poetry, just because computers don't write that style certainly doesn't motivate me to do so.
Derek:
| Could you please give me example lines within the 'winboard.ini'
| file that would successfully do so? I need to make sure every
| character is correct.
Sorry for the late response; I was on holiday for the past two weeks. The best way to do it is probably to make the option dependent on the engine selection. That means you have to write it behind the engine name in the list of pre-selectable engines, like:
/firstChessProgramNames={... 'C:/engines/joker/joker80.exe 23' /firstInitString='new\n' ... }
And something similar for the second engine, using /secondInitString. The path name of the joker80 executable would of course have to be where you installed it on your computer; the argument '23' sets the hash-table size. You could add other arguments there as well, e.g. for setting the piece values. Note that the executable name and all engine arguments are enclosed by the first set of quotes (which are double quotes, but these for some reason refuse to print in this forum), and everything after this first syntactical unit on the line is interpreted as WinBoard arguments that should be used with this engine when it gets selected. Note that string arguments are C-style strings, enclosed in double quotes, and making use of escape sequences like '\n' for newline. The default value for the init strings is 'new\nrandom\n'.
George Duke:
| Has initial array positioning already entered discussion for
| value determinations?
No, it hasn't, and I don't think it should, as this discussion is about Piece Values, and not about positional play. Piece values are by definition averages over all positions, and thus independent of the placement of pieces on the board. Note furthermore that the heuristic of evaluation is only useful for strategic characteristics of a position, i.e. characteristics that tend to be persistent, rather than volatile. Piece placement can be such a trait, but not always. In particular, in the opening phase, pieces are not locked in the places they start, but can find plenty of better places to migrate to, as the center of the board is still complete no-man's land. Therefore, in the opening phase, the concept of 'tempo' becomes important: if you waste too much time, the opponent gets the chance to conquer space, and prevent your pieces that were badly positioned in the array from developing properly.
I did some asymmetric playtesting for positional values in normal Chess, swapping Knights and Bishops for one side, or Knights and Rooks. I was not able to detect any systematic advantage the engines might have been deriving from this. In my piece value testing I eliminate positional influences by playing from positions that are as symmetric as possible given the material imbalance. And the effect of starting the pieces involved in the imbalance in different places is averaged out by playing from shuffled arrays, so that each piece is tried in many different locations.
Muller: Thank you for the helpful response. Frankly, I considered my own question so obvious as to be borderline-stupid but I just wanted to be certain. The following entries within the 'winboard.ini' file should enable me to playtest (limited) randomized and non-randomized versions of Joker80 against one another. Does it look alright? If/When I run out of more pressing playtesting missions, I may undertake this one after all.
/firstChessProgramNames={'Joker80 22' /firstInitString='new\n' 'Joker80 22' }
/secondChessProgramNames={'Joker80 22' /secondInitString='new\n' 'Joker80 22' }
Unfortunately, I no longer plan to playtest sets of CRC piece values by Muller, Scharnagl and Nalls against one another. I think having the pawn set to 85 and the queen set to 950 (as required by Joker80) for all three sets of material values would have the unintentional side effect of equalizing their scales (which are normally different). This means that the Muller set would, in fact, be tested against something other than a true, accurate representation of the Scharnagl and Nalls sets.
I am currently in the midst of conducting several 'minimized asymmetrical playtests' using SMIRF at moderate time controls. I want to tentatively determine who is correct in disagreements between our models involving 2:1 or 1:2 exchanges (with supreme pieces). I have to avoid its checkmate bug, though. This requires me to take back one move whenever the program declares checkmate and 'call the game' if a sizeable material and/or positional advantage indisputably exists for one player. Fortunately, this is almost always the case. I will give a report in a few-several weeks.
Well, never mind. The symmetrical playtesting would not have given any conclusive results with anything less than 2000 games anyway. The asymmetrical playtesting sounds more interesting.
I am not completely sure what Smirf bug you are talking about, but in the Battle of the Goths Championship it happened that Smirf played a totally random move when it could give mate in 3 (IIRC) according to both programs (Fairy-Max was the lucky opponent). This move blundered away the Queen with which Smirf was supposed to mate, after which Fairy-Max had no trouble winning with an Archbishop against some five Pawns. This seems to happen when Smirf has seen the mate, and stored the tree leading to it completely in its hash table. It is then no longer searching, and it reports score and depth zero, playing the stored moves (at least, that was the intention). I have never seen any such behavior when Smirf was reporting non-zero search depth, and in particular, the last non-zero-depth score before such an occurrence (a mate score) seemed to be correct. So I don't think there is much chance of an error when you believe the mate announcement and call the game. Of course you could also use Joker80 or TJchess10x8, which do not suffer from such problems.
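As a rough illustration of why so many games are needed, here is a sketch of the statistical error in a match result. It assumes independent games scored 1, 0.5 or 0; the function name and the example draw rate are assumptions for illustration, not figures from this thread.

```python
import math


def score_standard_error(games, draw_rate=0.0, expected_score=0.5):
    """Standard error of the average score per game over a match.

    Each game scores 1 (win), 0.5 (draw) or 0 (loss). The win probability
    follows from the expected score and the draw rate.
    """
    win_rate = expected_score - draw_rate / 2
    loss_rate = 1.0 - win_rate - draw_rate
    if win_rate < 0 or loss_rate < 0:
        raise ValueError("draw_rate is inconsistent with expected_score")
    # Per-game variance: E[x^2] - (E[x])^2 with x in {0, 0.5, 1}.
    variance = win_rate + 0.25 * draw_rate - expected_score ** 2
    return math.sqrt(variance / games)


if __name__ == "__main__":
    for n in (100, 500, 2000):
        se = score_standard_error(n, draw_rate=0.3)
        print(f"{n:5d} games: standard error {100 * se:.1f}% of the score")
```

With no draws the per-game standard deviation is 0.5, so 2000 games still leave a standard error of about 1.1% in the match score. Piece-value sets that differ only slightly would have to produce a score difference clearly larger than that before a result could be called significant.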
'Of course you could also use Joker80 or TJchess10x8, which do not suffer from such problems.' ____________________ While you were on vacation, I started a series of 'minimized asymmetrical playtests' using SMIRF. So, I will complete them using SMIRF. Joker80, running under Winboard F, has never acted buggy in computer vs. computer games. However, TJChess cannot handle my favorite CRC opening setup, Embassy Chess, without issuing false 'illegal move' warnings and stopping the game.
| However, TJChess cannot handle my favorite CRC opening setup,
| Embassy Chess, without issuing false 'illegal move' warnings and
| stopping the game.
Remarkable. I played this opening setup too, in Battle of the Goths, and never noticed any problems with TJchess. It might have been another version, though. If you have somehow saved the game, be sure to send it to Tony, so he can fix the problem.
Hecker: It was fairly easy for me to replicate the bug I experienced. In fact, I have never successfully played a computer vs. computer game to completion using TJChess10x8 in my life. So, you should be able to replicate the bug I experienced using the information I have provided. I hope you can fix it as well. Bug Report TJChess10x8 http://www.symmetryperfect.com/report
OK, I see the problem now. I forgot that the Embassy array is a mirrored one, with the King starting on e1, rather than f1. And that to avoid any problems with it in Battle of the Goths, I did not really play Embassy, but the fully equivalent mirrored Embassy. And with that one, none of the engines had problems, of course.
Actually it seems that it is not TJchess that is in error here: e1b1 does seem a legal castling in Embassy. It is WinBoard_F which unjustly rejects the move. Most likely because of the FEN reader ignoring specified castling rights for which it does not find a King on f1 and a Rook in the indicated corner.
The fact that you don't have this problem with Joker80 is because Joker80 is buggy. (Well, not really; it is merely outside its specs. Joker80 considers all castlings with a non-corner Rook and King not in the f-file as CRC castlings, which are only allowed in variant caparandom, but not in variants capablanca or *UNSPEAKABLE*. And Joker80 does not support caparandom yet.) So the fact that you don't see any problems with Joker80 is because it will never castle when you feed it the Embassy setup, so that WinBoard doesn't get the chance to reject the castling as illegal. And if the opponent castles, WinBoard would reject it as illegal, and not pass it on to Joker80.
I guess the fundamental fix will have to wait until I implement variant caparandom in WinBoard; I think that both WinBoard and Joker80 are correct in identifying the Embassy opening position as not belonging to Capablanca Chess, but needing the CRC extension of castling. (Even if it is only a limited extension, as the Rooks are still in the corner.) And after I fix it in WinBoard, I would still have to equip Joker80 with CRC capability before you could use it to play the Embassy setup. It is not very high on my list of priorities, though, as I see little reason to play Embassy rather than mirrored Embassy.
H.G.M.: '... This move blundered away the Queen with which Smirf was supposed to mate, after which Fairy-Max had no trouble winning with an Archbishop against some five Pawns. ...' Well, there still is a mating bug within SMIRF. Though I have improved SMIRF's behavior near mating situations (please request a copy of this unpublished engine if needed for key owners' testings), it still seems to be there. There might be a minimal chance that it could sometimes be caused by a communication problem using the adapter. But I am still convinced that it is caused by an internal bad evaluation storing design, which hopefully will be corrected in Octopus sometime ...
Using the mirror of Embassy Chess as a *.fen, TJChess10x8 runs fine now under Winboard F. Thanks!