Chess programs move making

H. G. Muller wrote on Tue, Sep 20, 2022 07:05 AM UTC in reply to Aurelian Florea from Mon Sep 19 11:00 AM:

That describes the 'policy head' of the NN, which is used to bias the move choice when walking the tree from root to leaf to find the next leaf to expand. (Otherwise that choice is based on the move scores, the number of visits of each move, and the total number of visits of the node.) But my understanding was that once the leaf is chosen and expanded, all daughters should receive a score from the 'evaluation head' of the NN, applied to the position after the move, rather than just inheriting their policy weight from the position before the move. These scores are then back-propagated towards the root by including them in the average score of every node on the path to the expanded leaf.
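To make the three steps above concrete (biased walk to a leaf, scoring the daughters after the move, back-propagating into the averages), here is a minimal Python sketch. It is only an illustration of the description in this comment, not the code of any actual engine. All names (`Node`, `select_child`, `expand`, `backpropagate`, `search`, `c_puct`) are made up for the example. The AlphaZero-style exploration formula, the convention that the evaluation head returns a score from the point of view of the side to move, and counting each daughter score as one sample in the averages are assumptions on my part. Terminal positions are not handled.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                 # policy-head weight, from the position BEFORE the move
    visits: int = 0
    total_score: float = 0.0     # sum of scores, seen by the player who moved into this node
    children: dict = field(default_factory=dict)

    @property
    def mean_score(self) -> float:
        return self.total_score / self.visits if self.visits else 0.0


def select_child(node, c_puct=1.5):
    """Pick the move with the best average score plus a policy-weighted bonus.

    The bonus formula is an assumed AlphaZero-style one: it grows with the
    node's total visits and shrinks with the visits of the move itself.
    """
    def priority(child):
        bonus = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.mean_score + bonus
    return max(node.children.items(), key=lambda item: priority(item[1]))


def expand(leaf, position, policy_head, value_head, apply_move):
    """Create all daughters; each gets a score from the position AFTER its move."""
    scores = []
    for move, prior in policy_head(position).items():
        # value_head is assumed to score for the side to move, which after the
        # move is the opponent, so negate to get the mover's point of view.
        score = -value_head(apply_move(position, move))
        leaf.children[move] = Node(prior=prior, visits=1, total_score=score)
        scores.append(score)
    return scores


def backpropagate(path, scores):
    """Fold the daughter scores into the average of every node on the path."""
    sign = -1.0  # the leaf's own score is seen by the opponent of its side to move
    for node in reversed(path):
        node.visits += len(scores)
        node.total_score += sign * sum(scores)
        sign = -sign  # the point of view alternates with every ply towards the root


def search(root, root_position, policy_head, value_head, apply_move, iterations):
    for _ in range(iterations):
        node, position, path = root, root_position, [root]
        while node.children:                      # walk from root to a leaf
            move, node = select_child(node)
            position = apply_move(position, move)
            path.append(node)
        scores = expand(node, position, policy_head, value_head, apply_move)
        backpropagate(path, scores)
```

Here `policy_head(position)` is expected to return a dict mapping moves to weights, `value_head(position)` a single number, and `apply_move(position, move)` the new position; all three stand in for the NN and the move generator and would have to be supplied by the caller.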