| Who | When |
Messages | |
| Hector Jasso | 7 | 05-05-2001 09:00 PM ET (US) |
Joe, Peter M. Todd and Bernhard Borges followed up on the idea with a paper called "Designing Socially Intelligent Agents For The Ultimatum Game." The paper can be found here: http://www.mpib-berlin.mpg.de/DOK/full/tod...soci/tpdessoci.html (I especially recommend reading the first paper. They run a more sophisticated model with many more variants of limited reasoning abilities for the agent.) They seem to have presented more work related to this, but I can't find where it was published. Try: http://bucky.stanford.edu/cef97/abstracts/borges2.html (It is interesting what they say in the abstract: "In particular, our results suggest that evolution will find and use only those reasoning abilities that aid RL.") If anybody finds the paper, please tell me!
Hector
|
| Joe Drish | 6 | 05-02-2001 02:59 PM ET (US) |
I agree with the common themes of the comments: 1) the paper presents a good idea, and 2) the paper is poorly written. Even though the paper takes an abstract approach, I think it would have been useful to include more simulations. I agree with Bianca that the author overuses footnotes, since most of them could very easily have been included in the main text of the paper.
Also, I was wondering how this paper could be built upon, i.e., what is the next step for research given the main conclusion in the paper.
|
| Bianca Zadrozny | 5 | 05-02-2001 12:59 PM ET (US) |
Edited by author 05-02-2001 01:15 PM
I agree with Sameer that the paper uses overly complicated language to explain a concept that, although interesting, is not that difficult to understand. Is it really necessary to spend ten pages explaining that the asymmetry of reinforcement leads to poorer performance by a player that uses virtual reinforcement?
|
| Kristin Branson | 4 | 05-02-2001 12:11 PM ET (US) |
I liked the main idea of this paper -- that the virtual reinforcement is not helpful is somewhat counterintuitive, yet very simple to see once the asymmetry of the reinforcement is explained. I am wondering how well this applies to other problems, though. The asymmetry in the information agent A receives stems from the asymmetry of the problem -- one action gains A some amount while another action gains A nothing. Of course, the asymmetry would still be present if A gains a larger amount from one action and a smaller amount from another, if the same sort of reinforcement is used. However, I think that even in a problem as simple as this, there is more reasoning that could be performed than the paper assumes. For example, if one is assuming an adaptive agent B, then agent A could analyze how his actions would train B. The asymmetry stems ultimately from having just one measure of the goodness of an action, and perhaps this can be avoided in many problems.
I think this is a good example of what Professor Elkan was mentioning on Monday, about how humans actually make poorer decisions than machines in some cases.
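To make the "some amount versus nothing" asymmetry concrete, here is a minimal sketch of the payoffs in one round of the standard Ultimatum Game. It covers only the textbook rules of the game, not the paper's learning model or its virtual reinforcement; the function name `ultimatum_payoffs` and the example amounts are assumptions chosen for illustration.

```python
def ultimatum_payoffs(total, offer, accepted):
    """Return (proposer_payoff, responder_payoff) for one round.

    Textbook rules (an assumption here, not the paper's exact setup):
    the proposer offers `offer` out of `total`; if the responder accepts,
    the split goes through, otherwise both players get nothing.
    """
    if not 0 <= offer <= total:
        raise ValueError("offer must lie between 0 and total")
    if accepted:
        return total - offer, offer
    return 0, 0


# One action yields a payoff for both players, the other yields nothing.
accepted_split = ultimatum_payoffs(10, 3, accepted=True)
rejected_split = ultimatum_payoffs(10, 3, accepted=False)
print(accepted_split, rejected_split)
```

With a single payoff number as the only feedback, a rejection tells a learner very little about what a different offer would have earned, which is the kind of one-sided information discussed above.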
|
| sameer agarwal | 3 | 05-02-2001 10:11 AM ET (US) |
Edited by author 05-02-2001 11:43 AM
Hi, apart from the unnecessarily complicated language at some points, I think the paper raises a very interesting point about asymmetry of information and, more importantly, the need to break assumptions and ideas down to their bare bones and investigate them. The kind of non-game-theoretically-optimal behaviour shown by this paper is very similar to the work in experimental economics using zero-intelligence (bounded rationality in the extreme) traders in a stock market, which show much of the same aggregate behaviour as human traders in the same market environments.
I like the last section the best, where the author, after spending time all over, finally makes it clear what he is saying and what he is NOT saying. Perhaps some of these comments could have been included at the beginning of the article; it would have made reading and understanding the article much simpler.
sameer
|
| Dave Kauchak | 2 | 05-02-2001 01:14 AM ET (US) |
Edited by author 05-02-2001 03:11 AM
I thought this was a very intriguing paper. The idea that they show is fairly simple but slightly counterintuitive. Throughout my reading of the paper, I wanted to contradict what they were presenting. However, the paper provides such a simple model that it is difficult not to agree with the authors. The result is a good one, and one that I think has shown its face in other facets of AI: more reasoning capability (which could also appear as heuristics) is not always better.
One area where I felt the paper could have expanded a bit more was section 4. The paper did such a nice job of simplifying the model in section 3 so that one could reason about it without too much difficulty. But, when the paper continues on in section 4 to relax the conditions of the game and show a more general idea, I found myself less convinced. The paper seems to quickly gloss over many of the relaxations and calls them harmless assumptions. Although I still tend to agree with their findings, I think the argument could have been made more concrete with more detailed arguments in section 4.
Dave
|
| Melanie Dumas | 1 | 05-01-2001 11:18 PM ET (US) |
Overall, I found the paper interesting, particularly the use of 'virtual reinforcement learning'. Often game-playing agents neglect to analyze their opponent's strategy, and this paper deals with the issue by cleanly incorporating the other player's decisions.
However, I was a little disappointed that the algorithm did not explicitly deal with sequences of moves over time. A related game is the 'Prisoner's Dilemma', where players select whether to cooperate with or defect against one another and get points based on a payoff matrix. In this game, one of the best strategies is 'tit-for-tat', where each agent replies with the last move his opponent made. This notion of time, or sequences of moves, would be interesting to analyze with the Ultimatum Game.
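For anyone who has not seen tit-for-tat before, here is a rough sketch of it in an iterated Prisoner's Dilemma. The payoff values (5, 3, 1, 0) are the commonly used ones and are an assumption, as are the helper names `play` and `always_defect`; nothing here comes from the paper itself.

```python
# Payoff matrix: (my move, opponent's move) -> (my points, opponent's points).
# "C" = cooperate, "D" = defect. The values are the usual textbook ones (assumed).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate on the first move, then repeat the opponent's last move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Baseline opponent that defects every round."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play repeated rounds; each strategy sees only the other's past moves."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        points_a, points_b = PAYOFFS[(a, b)]
        score_a += points_a
        score_b += points_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b, moves_a, moves_b


print(play(tit_for_tat, always_defect, 5))
print(play(tit_for_tat, tit_for_tat, 5))
```

The point of the sketch is that the strategy depends on the history of moves, which is the element of time that a single-round analysis leaves out.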
|