Gyozo Gidofalvi (Victor)
5
05-07-2001 12:39 PM ET (US)

I really liked the idea of applying reinforcement learning to automatic strategy acquisition for the game "Othello". The success of reinforcement learning in this domain is not surprising to me. Even though the authors clearly stated that the RL players initially knew only the game rules, I consider the min-max strategy a general heuristic for a whole class of games, so the players arguably started with more than the rules alone.
Although the notation used in the paper was more or less consistent, I sometimes found it confusing. The min-max strategy could have been stated simply as: "At game step t, choose the move that maximizes the evaluation function value at step t+1, considering the 'worst' possible move of the opponent." I have a similar problem with the intuitive meaning of the reinforcement signal Vt when t != tfin.
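As a rough illustration of that one-sentence rule, here is a minimal Python sketch. It is not the paper's implementation: the toy game tree, the leaf values, and the names minmax_move, legal_moves, apply_move and evaluate are all hypothetical, and it assumes the evaluation is applied to the position reached after the opponent's reply.

```python
# Hypothetical toy game tree (not Othello): position -> {move: next position}.
TREE = {
    "root": {"a": "A", "b": "B"},
    "A": {"x": "A1", "y": "A2"},
    "B": {"x": "B1", "y": "B2"},
}
# Hypothetical evaluation-function values for the positions after the reply.
VALUES = {"A1": 3, "A2": 9, "B1": 5, "B2": 6}


def legal_moves(state):
    """Moves available in a position (none at the leaves)."""
    return list(TREE.get(state, {}))


def apply_move(state, move):
    """Position reached by playing `move` in `state`."""
    return TREE[state][move]


def evaluate(state):
    """Evaluation function; 0 for positions without a stored value."""
    return VALUES.get(state, 0)


def minmax_move(state, legal_moves, apply_move, evaluate):
    """Pick the move whose worst-case evaluation after the reply is highest."""

    def worst_case(move):
        after_mine = apply_move(state, move)
        replies = legal_moves(after_mine)
        if not replies:
            # The opponent has no reply: evaluate the position as it stands.
            return evaluate(after_mine)
        # Assume the opponent picks the reply that is worst for us.
        return min(evaluate(apply_move(after_mine, r)) for r in replies)

    return max(legal_moves(state), key=worst_case)


best = minmax_move("root", legal_moves, apply_move, evaluate)
print(best)
```

In this toy tree, move "a" can lead to the highest value (9) but also to the lowest (3), while move "b" guarantees at least 5, so the min-max rule prefers "b".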
Like Bianca, I also missed the error bars on the graphs. But the results for the 0-game in the correctness graph (figure 5) imply that the results are statistically significant, at least for that method. It would also have been interesting to see the behavior of the strategies in earlier moves of the game. Such an analysis might have given better insight into the effects of the intermediate reinforcement signal used.
Overall, I found the paper interesting, but I agree with the comments made about the need for evaluation against better players (strategies evolved through GP, for example).