QuickTopic (SM) free message boards
Topic: Balancing multiple sources of reward in reinforcement learning
Dave Kauchak, message 1
04-25-2002 03:37 AM ET (US)
Edited by author 04-25-2002 03:38 AM
I hate to be the one always complaining about the writing of these papers, but I found this paper particularly difficult to digest because of the writing. Some of this may arise because the paper seems to be at the intersection of reinforcement learning and game theory.

I think one of my main complaints is vocabulary/notation. The paper uses 'observations' instead of 'states'. Then, to make matters worse, the authors use s to abbreviate sources. This makes many of the equations look odd, given that most reinforcement learning texts use s as the abbreviation for state. On top of all this, the problems in the experimental section involve states of a seemingly different sort.

Also, in a number of places the paper seems to try to present things in chronological order, which tends to add to the confusion.

Beyond the writing, I also have a couple of questions. Is there a better or simpler method to solve some of these problems? For example, the paper notes that sources might knowingly adjust their scores for personal gain. In this situation, it seems to me that if we simply impose the constraint that sources have no (or only minimal) knowledge of other sources, and we simply scale the rewards from these sources, then we can get the desired result.
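
To make the scaling idea concrete, here is a minimal sketch of one way it could be read. It is not the paper's method. The class name SourceScaler, the choice to divide by the largest magnitude seen so far, and the source labels are all assumptions made for illustration.

```python
from typing import Dict


class SourceScaler:
    """Rescale each source's reward by the largest magnitude seen from it.

    Assumption: "scaling" means per-source normalization into [-1, 1].
    """

    def __init__(self) -> None:
        self.max_abs: Dict[str, float] = {}

    def scale(self, source: str, reward: float) -> float:
        # Track the largest absolute reward this source has reported so far.
        peak = max(self.max_abs.get(source, 0.0), abs(reward))
        self.max_abs[source] = peak
        # A source that has only ever reported 0 contributes 0.
        return reward / peak if peak > 0 else 0.0


def combined_reward(scaler: SourceScaler, rewards: Dict[str, float]) -> float:
    """Sum the scaled rewards from all sources for one time step."""
    return sum(scaler.scale(src, r) for src, r in rewards.items())
```

With this kind of scaling, a source that reports 1000 where another reports 1 gains no extra influence just from the size of its numbers. Whether that is enough to cover the strategic case the paper worries about is exactly the open question.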

Is there a reason that the paper chose the particular set of problems to experiment with? Although I have not done an extensive search of the reinforcement learning literature, I have never encountered these particular problems. Could we simply use the grid problem with multiple goals? It seems to me that using a more common test problem would be useful not only for familiarity, but also for credibility.
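
As a rough illustration of the kind of test problem I have in mind, here is a small sketch of a grid with two goals, each paying out through its own reward source. The 5x5 size, the goal positions, the reward values, and the names source_a and source_b are all hypothetical and not taken from the paper.

```python
from typing import Dict, Tuple

Pos = Tuple[int, int]

# Hypothetical 5x5 layout: each goal cell belongs to one reward source.
SIZE = 5
GOALS: Dict[Pos, Tuple[str, float]] = {
    (0, 4): ("source_a", 1.0),
    (4, 0): ("source_b", 5.0),
}
MOVES: Dict[str, Pos] = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def step(pos: Pos, action: str) -> Tuple[Pos, Dict[str, float]]:
    """Move one cell (clamped to the grid) and return per-source rewards."""
    dr, dc = MOVES[action]
    row = min(max(pos[0] + dr, 0), SIZE - 1)
    col = min(max(pos[1] + dc, 0), SIZE - 1)
    new_pos = (row, col)
    # Every source reports a reward each step; only a reached goal is nonzero.
    rewards = {name: 0.0 for name, _ in GOALS.values()}
    if new_pos in GOALS:
        name, value = GOALS[new_pos]
        rewards[name] = value
    return new_pos, rewards
```

An agent starting in the middle of the grid then has to trade off a nearby small reward against a larger one from a different source, which seems close to the situation the paper is trying to study.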