Topic: CSE 151 in Fall 2008
All messages    << 121-136  105-120 of 153  89-104 >>
Charles Elkan  120
12-07-2008 06:38 PM ET (US)
What is a good way to prevent the soft-max policy from picking illegal actions?
You can exclude these from the softmax calculation, i.e. don't let them have a numerator. Or you can make them have negative infinity Q-value.
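A minimal sketch of the second suggestion, assuming Q-values are held in a plain list and legality is given as a parallel list of booleans. The function name `masked_softmax` and the `temperature` parameter are illustrative choices, not part of the assignment. Setting an illegal action's Q-value to negative infinity makes its numerator exp(-inf) = 0, which has the same effect as leaving it out of the softmax.

```python
import math

def masked_softmax(q_values, legal, temperature=1.0):
    """Softmax over actions in which illegal actions get probability 0.

    q_values: one Q-value per action.
    legal: parallel list of bools, True where the action is allowed.
    """
    if not any(legal):
        raise ValueError("at least one action must be legal")
    # Illegal actions get Q = -infinity, so their numerator exp(-inf) is 0.
    masked = [q if ok else -math.inf for q, ok in zip(q_values, legal)]
    # Subtract the largest legal Q-value before exponentiating, for stability.
    q_max = max(masked)
    weights = [math.exp((q - q_max) / temperature) for q in masked]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: four actions, the third one is illegal in this state.
probs = masked_softmax([1.0, 2.0, 0.5, 2.0], [True, True, False, True])
```

Sampling from `probs` can then never return the illegal action, and the remaining probabilities still sum to one.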
Meir Schwarz  119
12-07-2008 05:42 PM ET (US)
Will there be a review session for the final?
Mike Rose  118
12-07-2008 03:11 PM ET (US)
Our Policy Iteration algorithm produces the correct policy, but the values seem to be off by about 0.05. Is this acceptable, or does it hint that we have a slight error in our algorithm? If it is an error, any idea where we should start debugging?
Mike Rose  117
12-07-2008 03:04 PM ET (US)
What is a good way to prevent the soft-max policy from picking illegal actions?
matt  116
12-07-2008 02:57 AM ET (US)
How is it possible for our algorithm to find the optimal policy, but not be able to find the alternative optimal policies? (It basically finds the opposite of those policies when values in the range are entered.)
Chris  115
12-06-2008 06:21 PM ET (US)
Deleted by author 12-06-2008 06:21 PM
Charles Elkan  114
12-05-2008 07:02 AM ET (US)
Edited by author 12-05-2008 07:04 AM
When computing the expected total reward it can grow to almost twice the maximum (2). For example, when the proposed action is to move to the goal field (reward = 1) then r(s, pi(s)) = 1. But in the sum this reward is again taken into account since the goal field is included in s'. V(goal) = 1 and therefore V(goal) * 0.8 * gamma is again added to the total expected reward, right? So I used r(s) instead. I am confused since this is the reward when moving into the current state and the expected total reward starting in this state should not take this r(s) into account, right?
You are right there is a problem here. The fundamental issue is that we must avoid double-counting. Since you have understood the problem, you can solve it (in more than one way) quite easily.

The rewards computed match exactly the values on page 8 of the slide except (1,3) and (1,4) which are slightly different (0.59 instead of 0.61, and 0.37 instead of 0.388). Is it possible that the values in the slides were computed in some other way?
It's possible that something is unspecified in the slides and slightly different in your code. This discrepancy is unlikely to indicate any algorithm bug, so you don't need to track down its cause.
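One possible way to avoid the double-counting discussed above is sketched here, under assumptions that are not from the slides: the goal is treated as an absorbing state whose value is pinned to 0, and the reward of 1 is collected only through r(s, pi(s)) when the move into the goal succeeds. The tiny corridor world, the 0.8 success probability, and gamma = 0.9 are hypothetical, chosen just to make the example runnable.

```python
def evaluate_policy(transitions, rewards, terminal, gamma=0.9, tol=1e-10):
    """Iterative policy evaluation for one fixed policy pi.

    transitions[s] maps each next state s' to p(s' | s, pi(s)).
    rewards[s] is r(s, pi(s)), the expected immediate reward.
    terminal is the set of absorbing states; their value stays 0,
    so the goal reward is counted once, via rewards, and never again.
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s in transitions:
            if s in terminal:
                continue
            new_v = rewards[s] + gamma * sum(
                p * values[s2] for s2, p in transitions[s].items()
            )
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:
            return values

# Hypothetical corridor 0 -> 1 -> 2 -> 3 (goal). The policy always moves
# right; the move succeeds with probability 0.8, otherwise the agent stays.
transitions = {
    0: {1: 0.8, 0: 0.2},
    1: {2: 0.8, 1: 0.2},
    2: {3: 0.8, 2: 0.2},
    3: {},
}
# Expected immediate reward: 0.8 * 1 for the state next to the goal.
rewards = {0: 0.0, 1: 0.0, 2: 0.8, 3: 0.0}
values = evaluate_policy(transitions, rewards, terminal={3})
```

With this convention no state's value exceeds the maximum reward of 1. The other common convention, using a per-state reward r(s) and letting V(goal) = r(goal) with no reward on the entering move, should work equally well as long as only one of the two is used.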
Tobias  113
12-05-2008 01:33 AM ET (US)
Edited by author 12-05-2008 01:40 AM
V(s) = r(s, pi(s)) + sum_s' p(s' | s, pi(s)) gamma V(s')

- When computing the expected total reward it can grow to almost twice the maximum (2). For example, when the proposed action is to move to the goal field (reward = 1) then r(s, pi(s)) = 1. But in the sum this reward is again taken into account since the goal field is included in s'. V(goal) = 1 and therefore V(goal) * 0.8 * gamma is again added to the total expected reward, right? So I used r(s) instead. I am confused since this is the reward when moving into the current state and the expected total reward starting in this state should not take this r(s) into account, right?

- The rewards computed match exactly the values on page 8 of the slide except (1,3) and (1,4) which are slightly different (0.59 instead of 0.61, and 0.37 instead of 0.388). Is it possible that the values in the slides were computed in some other way?
Decision Tree Notes  112
12-04-2008 05:55 PM ET (US)
I posted my decision tree notes:

http://www.cs.ucsd.edu/~knoto/pub/notes/

Please send mail if you have questions.
-Keith
Charles Elkan  111
11-30-2008 01:40 AM ET (US)
I'll be at a conference this coming week. The lectures on Tuesday and Thursday will be given by Dr. Keith Noto, http://www.cs.ucsd.edu/~knoto/

Section will happen on Monday as usual.

I'll try to answer questions here on the message board as usual also. Please ask questions about MDPs!
Charles Elkan  110
11-24-2008 07:12 PM ET (US)
Sample solution available for the homework assignment

See http://www.cs.ucsd.edu/users/elkan/151/assignment2soln.pdf
Charles Elkan  109
11-21-2008 11:58 AM ET (US)
If you like CSE 151, the following courses are highly recommended as further learning.


Announcing COGS 118A and 118B (Natural Computation I and II)

Are you interested in graduate school in machine learning,
computational neuroscience, or computational modeling? Interested in
lucrative jobs using machine learning to solve practical problems?
Consider taking one or both of COGS 118A and 118B (Natural Computation
I and II). These courses give a rigorous background in machine
learning theory and algorithms designed to provide the background we
want our machine learning graduate students to have. A strong math
background is required (Math 20E, 20F, 180A and a prior course in
computer programming are prerequisites) but you may discuss your
individual situation with the instructors. Please note that these
courses can be taken in either order. You do not need to take 118A
before 118B. 118A will be offered in Winter 2009 (Professor Angela
Yu) and 118B in Spring 2009 (Professor Virginia de Sa).


118A. Natural Computation I (4) This course is an introduction to
computational modeling of biological intelligence, focusing on neural
networks and related approaches to SUPERVISED learning. Topics include
estimation, filtering, optimization, neural networks, support
vector machines, Gaussian Processes, Bayes nets. Prerequisites: Cognitive
Science 109 or equivalent, Mathematics 20E, Mathematics 20F, and
Mathematics 180A or consent of instructor.

118B. Natural Computation II (4) This course is an introduction to
computational modeling of biological intelligence, focusing on neural
networks and related approaches to UNSUPERVISED learning. Topics
include density estimation, clustering, self-organizing maps,
principal component analysis, kernel methods, and information
theoretic models. Prerequisites: Cognitive Science 109 or equivalent,
Mathematics 20E, Mathematics 20F, and Mathematics 180A or consent of
instructor.

-----------------------------------------------------------------
Virginia de Sa                    desa@ucsd.edu
Department of Cognitive Science   ph: 858-822-5095
9500 Gilman Dr.                       858-822-2402
La Jolla, CA 92093-0515           fax: 858-534-1128
-----------------------------------------------------------------
Charles Elkan  108
11-18-2008 09:58 AM ET (US)
REPRESENTATION, SEARCH, AND THE WEB

Professor Richard K. Belew

CogSci 188 - Winter, 2009

Tuesday, Thursday 9:30-10:50a

Warren Lecture Hall 2113

SectID#?? - (4 units)

http://abbey.ucsd.edu:8080/cogs188

Recent estimates suggest that our species is producing five exabytes (5 billion gigabytes) of "content" each year! But the more content we produce, the harder it often becomes to find anything, let alone anything "relevant." This course will present a survey of computational techniques designed to represent content, search through it, and use the WWW to connect the new producers of content with their new audiences. The central focus will be on probabilistic techniques for inferring textual documents' "meaning" from word occurrence statistics. Graph analysis techniques applied to bibliographic citations and the Web, Web crawling, and Web2.0 techniques will also be discussed.

Students will analyze these algorithms mathematically and experiment with their implementation. Students with and without programming backgrounds will be accommodated. Both graduate and undergraduate students are welcome.
Charles Elkan  107
11-18-2008 01:10 AM ET (US)
/m106: Likelihoods are positive numbers very close to zero. They should be increasing with each iteration of EM.
Meir Schwarz  106
11-18-2008 12:49 AM ET (US)
If I'm seeing positive numbers that are progressively getting smaller am I computing likelihood instead of log likelihood?
Charles Elkan  105
11-18-2008 12:46 AM ET (US)
/m102: Yes, there is a different likelihood for each document. Because this is based on a pmf (not pdf) it is a number between 0 and 1 that is very close to zero in practice. The sum of all the log likelihoods is not one. It is a large negative number that should be increasing, i.e. getting less negative. For how to check if you have maximized the sum of log likelihoods, see /m104.
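The points above can be illustrated with a small sketch. It assumes a mixture-of-multinomials document model, which the thread does not spell out, so the model, the function names, and the numbers are all hypothetical. It computes a document's log likelihood directly in log space with the log-sum-exp trick, and checks that a history of summed log likelihoods never decreases across EM iterations. The multinomial coefficient is omitted, so the likelihood here is the probability of one particular word sequence.

```python
import math

def log_sum_exp(xs):
    """Stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def document_log_likelihood(counts, log_priors, log_word_probs):
    """log p(document) under a mixture of multinomials (assumed model).

    counts[w]: number of times word w occurs in the document.
    log_priors[k]: log of the mixing weight of component k.
    log_word_probs[k][w]: log probability of word w under component k.
    """
    per_component = [
        lp + sum(c * lw for c, lw in zip(counts, lws))
        for lp, lws in zip(log_priors, log_word_probs)
    ]
    return log_sum_exp(per_component)

def is_nondecreasing(history, slack=1e-9):
    """True if each value is at least the previous one, up to rounding slack."""
    return all(b >= a - slack for a, b in zip(history, history[1:]))

# Hypothetical 1000-word document over a 3-word vocabulary, 2 components.
counts = [300, 200, 500]
log_priors = [math.log(0.5), math.log(0.5)]
log_word_probs = [
    [math.log(p) for p in (0.3, 0.2, 0.5)],
    [math.log(p) for p in (0.5, 0.3, 0.2)],
]
ll = document_log_likelihood(counts, log_priors, log_word_probs)
```

For a document of this size the likelihood itself is so close to zero that `math.exp(ll)` underflows to 0.0 in double precision, which is one practical reason to track the log likelihood instead. Passing the per-iteration sums of log likelihoods to `is_nondecreasing` gives a quick sanity check on an EM implementation.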