Dave Kauchak
2
04-23-2002 04:16 AM ET (US)
I thought the ideas in the paper were good and was fairly impressed with the results. However, I found that the writing made it hard to understand the ideas clearly.
The paper tended to use an interesting vocabulary, such as "extract information" and numerous references to "behavior", with no obvious definition. Most of the time the meaning could be inferred, but it left an imprecise feeling.
The introduction and particularly the abstract were somewhat uninformative about the paper.
I think the paper could have used some examples to help clarify some of the ideas. For example, the paper states that there are many situations where R0 is known but the transition probabilities are not. A list of a few such cases would make the descriptions more understandable.
Finally, one key assumption that is implied but, I think, never stated is that the mentor and the observer have the same goal(s). I found this a bit confusing initially, particularly given that the mentor and the observer don't have to have the same reward function.