QuickTopic (SM) free message boards
Topic: Balancing multiple sources of reward in reinforcement learning
Dana Dahlstrom  6
04-25-2002 03:56 PM ET (US)
Dave: I think it makes sense to distinguish between observations
and states when the environment is only partially observable; the
observation is just the visible part of the state.
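
To make that distinction concrete, here is a minimal sketch, not taken
from the paper. The names State, hidden_goal, and observe are
hypothetical, chosen only for illustration: the full state has a part
the agent cannot see, and the observation is a projection onto the
visible part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical full environment state (illustrative only)."""
    position: int     # visible to the agent
    hidden_goal: int  # not visible to the agent


def observe(state: State) -> int:
    """Return only the visible part of the state."""
    return state.position


# Two distinct states that the agent cannot tell apart.
s1 = State(position=3, hidden_goal=0)
s2 = State(position=3, hidden_goal=9)
same_observation = observe(s1) == observe(s2)
```

Under full observability, observe would be the identity and the two
terms would coincide, which is presumably why they are often used
interchangeably.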

It's odd to me that none of the scenarios in the figures looks
like a Nash equilibrium, yet the authors write of both examples
that ``the algorithm consistently settled on the solution
shown''. For some reason, though, the policy selected for the
first example is only ``approximately uniform''. Why?

The second example seems to have even stranger irregularities.
Shouldn't the desired policies, votes, and resultant policy
reflect the symmetry in the state diagram? Why then, for example,
do both sources agree the agent should move left in states 2 and
3, but not that it should move right in states 7 and 8? The
authors don't explain this; in fact, they don't mention it at
all.
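
The kind of symmetry check I have in mind can be sketched as follows.
This is only an illustration under stated assumptions: I am assuming a
chain of states numbered 1 through 9, so that state s mirrors state
10 - s (which is what pairs 2 and 3 with 8 and 7), and the function
names and the sample policy are made up, not taken from the paper's
figures.

```python
# Assumption: states 1..n_states on a line, mirrored about the center.
MIRROR_ACTION = {"left": "right", "right": "left"}


def mirror_state(s: int, n_states: int = 9) -> int:
    """Map a state to its mirror image about the center of the chain."""
    return n_states + 1 - s


def asymmetric_states(policy: dict, n_states: int = 9) -> list:
    """Return states whose action is not the mirror image of the
    action chosen in the mirrored state."""
    bad = []
    for s, action in policy.items():
        m = mirror_state(s, n_states)
        if m in policy and policy[m] != MIRROR_ACTION[action]:
            bad.append(s)
    return sorted(bad)


# Made-up policy for illustration only (not the paper's results).
sample_policy = {2: "left", 3: "left", 7: "left", 8: "right"}
violations = asymmetric_states(sample_policy)
```

A policy that respects the mirror symmetry yields an empty list; any
state that is reported marks a place where the left and right halves
of the diagram are treated differently.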