TOPIC:

CSE 291, Spring 2008

(not accepting new messages)
  Messages 13-12 deleted by author 07-23-2009 12:59 PM
11
Charles Elkan
05-12-2008
05:34 PM ET (US)
This report can be an extended version of your project proposal. Include a description of what you have achieved so far, any obstacles you have overcome, etc. Feel free to put the new text in italics to distinguish it from your proposal.

Ideally the progress report will be a preliminary version of your final report. The progress report can have results missing, but the framework for the results should be there, along with a description of your methods and implementation.
10
Justin Oysad
05-12-2008
05:15 PM ET (US)
Sounds pretty cool, Jerry! I'm going to implement the Web-based kernel I talked about in class, and experiment with using different search engines as sources for this kernel.

I understand that there is a 2-3 page progress report due tomorrow (Tuesday). I believe I could summarize my progress in a short paragraph, so I was wondering what else might be interesting to include in this report. Would a detailed implementation description be of interest (it could be recycled in the final report afterwards)?

Thanx for any tips and tricks :)
9
Charles Elkan
05-09-2008
02:53 PM ET (US)
Everyone should feel free to post his/her project proposal,
and to provide feedback on posted proposals.

Charles
8
Jerry Fu
05-09-2008
02:46 PM ET (US)
Out of curiosity, what are other people working on for their projects?

I have been downloading messages off of Yahoo's finance message boards. I am going to implement Topic Sentiment mixture modeling on these messages, as described here:
http://www2007.org/htmlpapers/paper680/. It will be interesting to see what kind of results come from this message set, and whether this model is a viable way to filter the messages on Yahoo's message boards. There are many off-topic messages on the board, but this could conceivably let people view only the messages that fall under certain topics or sentiments.
7
Jerry Fu
04-29-2008
04:02 PM ET (US)
Regarding how the user click data is used in the Google News paper, I went back and looked at a referenced paper, "Latent Semantic Models for Collaborative Filtering." The E-step is only calculated for (user, story) pairs that actually occurred. So the q* weights calculated in the E-step are non-zero only for existing pairs, and these weights can be used to calculate the conditional probabilities of stories given clusters and of clusters given users.

This paper has more details. I highly recommend section 3.2 if you need to understand this algorithm.

Thomas Hofmann, Latent semantic models for collaborative filtering, ACM Transactions on Information Systems (TOIS), v.22 n.1, p.89-115, January 2004
http://delivery.acm.org/10.1145/970000/963...85&CFTOKEN=22509318
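
A minimal sketch of the idea described above, not code from either paper: the E-step loops only over (user, story) pairs that occurred, and the M-step turns the resulting weights into p(story | cluster) and p(cluster | user). The function name plsi_em, its parameters, and the random initialization are assumptions made for illustration, and the sketch leaves out smoothing and convergence checks.

```python
import random
from collections import defaultdict

def plsi_em(pairs, num_clusters, iterations=50, seed=0):
    """Fit p(story | cluster) and p(cluster | user) from observed (user, story) pairs."""
    rng = random.Random(seed)
    users = sorted({u for u, _ in pairs})
    stories = sorted({s for _, s in pairs})
    clusters = range(num_clusters)

    def random_dist(keys):
        # Strictly positive random weights, normalized to sum to 1.
        weights = [rng.random() + 0.1 for _ in keys]
        total = sum(weights)
        return {k: w / total for k, w in zip(keys, weights)}

    p_s_given_z = [random_dist(stories) for _ in clusters]          # p(s | z)
    p_z_given_u = {u: random_dist(list(clusters)) for u in users}   # p(z | u)

    for _ in range(iterations):
        # E-step: weights q(z; u, s), computed only for pairs that occurred.
        q = []
        for u, s in pairs:
            joint = [p_s_given_z[z][s] * p_z_given_u[u][z] for z in clusters]
            norm = sum(joint)
            q.append([j / norm for j in joint])

        # M-step: accumulate the weights, then renormalize.
        story_mass = [defaultdict(float) for _ in clusters]
        user_mass = {u: [0.0] * num_clusters for u in users}
        for (u, s), weights in zip(pairs, q):
            for z in clusters:
                story_mass[z][s] += weights[z]
                user_mass[u][z] += weights[z]
        for z in clusters:
            total = sum(story_mass[z].values())
            p_s_given_z[z] = {s: story_mass[z][s] / total for s in stories}
        for u in users:
            total = sum(user_mass[u])
            p_z_given_u[u] = {z: m / total for z, m in enumerate(user_mass[u])}

    return p_s_given_z, p_z_given_u
```

The cost of one iteration is proportional to the number of observed pairs times the number of clusters, rather than to the full users-by-stories matrix.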
6
Matt Rodriguez
04-09-2008
03:47 PM ET (US)

Where is Yuri Lifshits's talk being held? I didn't see it on the CSE event listing.

Thanks
5
Charles Elkan
04-09-2008
12:49 AM ET (US)
Thursday April 10: Special lecture, same trailer as last week.

Instead of a regular 291 meeting, please go to a lecture by Prof. Sanjoy Dasgupta on the k-medoid problem. This is an abstraction of the following problem: where should Google locate its servers to be as close as possible on average to its users?

TIME: 3:30pm, Thursday April 10.

LOCATION: University Center 413A room 1.
This is a trailer just south of parking lot P403, which in
turn is just south of the Canyonview swimming pool.

Sanjoy's lecture is part of his course; see www.cs.ucsd.edu/~dasgupta/291
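
To make the objective concrete, here is a small sketch, not taken from the lecture: it picks k of the given points as centers so that the average distance from each point to its nearest center is as small as possible. It tries every subset, so it is only usable on tiny inputs; the function names and the one-dimensional example are assumptions for illustration.

```python
from itertools import combinations

def average_distance(points, centers, distance):
    """Average distance from each point to its nearest center."""
    return sum(min(distance(p, c) for c in centers) for p in points) / len(points)

def best_medoids(points, k, distance=lambda a, b: abs(a - b)):
    """Exhaustively search all k-subsets of the points for the best centers."""
    return min(
        combinations(points, k),
        key=lambda centers: average_distance(points, centers, distance),
    )

# Hypothetical example: six users on a line, two servers to place.
users = [0, 1, 2, 10, 11, 12]
servers = best_medoids(users, 2)
print(servers)  # (1, 11)
```

The exhaustive search examines every choice of k centers among the n points, which quickly becomes infeasible as n grows; that is why the problem is interesting algorithmically.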
4
Aditya Menon
04-08-2008
04:29 PM ET (US)
In the "Automatic Identification of User Goals for Web Search" paper: for the anchor-link distribution, there is a fair assumption that there isn't a single authoritative site for most informational queries. However, in the future this could become slightly incorrect if sites like Wikipedia become more widely embraced (or, for that matter, if Google's Knol gathers momentum, as it has the explicit objective of being the first site that users will want to visit to get information on a subject!). I wouldn't expect this to skew the distribution of links dramatically, but it suggests that the user-click distribution might be slightly more robust.
Edited 04-08-2008 04:30 PM
3
Charles Elkan
04-04-2008
01:44 PM ET (US)
Advance notice: Wednesday April 9, 2pm.

Yuri Lifshits from Caltech will give a talk about nearest-neighbor algorithms. Yuri is one of the pioneers in identifying and solving basic algorithmic problems underlying large-scale web applications.

Do plan to attend his talk, and do explore his website at http://yury.name/
2
Charles Elkan
04-03-2008
01:19 PM ET (US)
Thursday April 3: Special lecture, unusual location.

Instead of a regular 291 meeting, please go to a lecture by Prof. Sanjoy Dasgupta on the k-means++ algorithm. This is the latest research on the k-means algorithm, which is the single most widely used method for unsupervised learning. The k-means++ algorithm is important theoretically and practically.

TIME: 3:30pm today, Thursday April 3.

LOCATION: University Center 413A room 1.
This is a trailer just south of parking lot P403, which in
turn is just south of the Canyonview swimming pool.

Sanjoy's lecture is part of his course; see www.cs.ucsd.edu/~dasgupta/291
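
For reference, a minimal sketch of the k-means++ seeding step as it is commonly described (the lecture's own presentation is not recorded in this thread): the first center is drawn uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The function names are assumptions for illustration; the usual k-means iterations would then start from these seeds.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seeds(points, k, seed=0):
    """Pick k initial centers from the points using D^2 weighting."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Weight of each point: squared distance to its nearest chosen center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        # random.choices raises ValueError if every weight is zero,
        # i.e. if all points coincide with already chosen centers.
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

# Hypothetical example: four distinct points in the plane, three seeds.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (20.0, 0.0)]
print(kmeanspp_seeds(points, 3))
```

Points that are already centers have weight zero, so the seeds tend to be spread out across the data rather than clumped together.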
1
Charles Elkan
03-31-2008
07:39 PM ET (US)
Welcome to CSE 291!