Dave Kauchak | 1 | 04-25-2002 03:37 AM ET (US)
Edited by author 04-25-2002 03:38 AM
I hate to be the one always complaining about the writing of these papers, but I found this paper particularly difficult to digest because of the writing. Some of this may arise because the paper seems to be at the intersection of reinforcement learning and game theory.
I think one of my main complaints is the vocabulary and notation. The paper uses 'observations' instead of 'states'. Then, to make matters worse, the authors use s to abbreviate sources. This makes many of the equations look odd, given that most reinforcement learning texts use s as the abbreviation for state. On top of all this, the problems in the experimental section involve states of a seemingly different sort.
Also, in a number of places the paper seems to try to present things in chronological order, which tends to add to the confusion.
Beyond the writing, I also have a couple of questions. Is there a better or simpler method to solve some of these problems? For example, the paper notes that sources might knowingly adjust scores for personal gain. In this situation, it seems to me that if we simply impose the constraint that sources have no knowledge (or only minimal knowledge) of other sources, and we simply scale the rewards from these sources, then we can get the desired result.
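As a rough sketch of the scaling idea (my own illustration, not anything taken from the paper): assume each source reports a non-empty list of scalar rewards, one per time step, and that we rescale each source's rewards to [0, 1] using only that source's own range. The names `scale_rewards` and `combined_reward` are hypothetical, and min-max scaling is just one possible choice of scaling.

```python
from typing import Dict, List


def scale_rewards(rewards_by_source: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Min-max scale each source's rewards to [0, 1], independently of other sources.

    Assumption: every source has at least one reward. Because each source is
    scaled by its own range only, a source that inflates its scores gains no
    extra weight relative to the others.
    """
    scaled: Dict[str, List[float]] = {}
    for source, rewards in rewards_by_source.items():
        lo, hi = min(rewards), max(rewards)
        span = hi - lo
        if span == 0:
            # A constant source expresses no preference; map it to the midpoint.
            scaled[source] = [0.5 for _ in rewards]
        else:
            scaled[source] = [(r - lo) / span for r in rewards]
    return scaled


def combined_reward(scaled: Dict[str, List[float]], t: int) -> float:
    """Average the scaled rewards of all sources at time step t."""
    return sum(rewards[t] for rewards in scaled.values()) / len(scaled)


# Hypothetical data: "inflated" reports the same preferences as "honest",
# but multiplied by 100 in an attempt to dominate the combined signal.
example = scale_rewards({"honest": [0, 1, 2], "inflated": [0, 100, 200]})
```

After scaling, both sources in this toy example contribute identical values, so the inflation has no effect. This obviously ignores subtler manipulation, such as a source changing the shape of its rewards rather than their magnitude.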
Is there a reason that the paper chose this particular set of problems to experiment with? Although I have not done an extensive search of the reinforcement learning literature, I have never encountered these particular problems. Could we simply use the grid problem with multiple goals? It seems to me that using a more common test problem would be useful not only for familiarity, but also for credibility.