Greg Hamerly
2
04-04-2002 02:27 AM ET (US)
This paper was a fun read. One thing I'm unclear on is what iterations 1-6 represent in figures 6/7. Do they indicate the recursion depth to which the authors allow the reinforcement learning to proceed? Also, how is "life time profit" (figure 6) different from "total profit" (figure 7)? The sarsa curves look the same in both.
I do think the most intriguing thing is shown in figure 8, where the sarsa-RL approach conserves its mailings, showing a rather intelligent long-term strategy.