| Who | When |
Messages | |
|
|
| |
| Charles Elkan
|
238
|
 |
|
06-27-2009 02:24 AM ET (US)
|
|
Subject: Netflix Prize: Last Call for Grand Prize
Date: Sat Jun 27 00:29:54 2009 UTC
From: noreply@netflixprize.com
As of the submission by team "BellKor's Pragmatic Chaos" on June 26, 2009 18:42:37 UTC, the Netflix Prize competition entered the "last call" period for the Grand Prize. In accord with the Rules, teams have thirty (30) days, until July 26, 2009 18:42:37 UTC, to make submissions that will be considered for this Prize. Good luck and thank you for participating!
|
| |
Charles Elkan
|
235
|
 |
|
06-12-2009 06:24 PM ET (US)
|
|
Final exam and letter grades
The final exam was out of 139. The mean score was 94, with standard deviation 19.
Overall, I assigned six letter grades of A- and higher, and seven of B+ and lower.
If you'd like to pick up your final exam, please email me after July 6.
|
| Charles Elkan
|
233
|
 |
|
06-09-2009 12:01 PM ET (US)
|
|
Final exam is today at 6:30pm
In our usual room, CSE 2154. The exam is two hours long. You will be able to stay until 9pm.
You may bring and use all materials handed out in class, printouts from the website and my notes, any handwritten notes of your own, copies of your own quizzes and assignments, and a calculator.
You may also bring one book, but I don't expect any books to be especially useful.
|
| Charles Elkan
|
231
|
 |
|
06-08-2009 11:43 AM ET (US)
|
|
Last quiz and assignment
Out of 6 and 10 respectively, as usual.
Quiz: mean 4.7, stdev 1.3. Assignment: mean 7.8, stdev 1.2.
|
| Charles Elkan
|
230
|
 |
|
06-06-2009 02:09 PM ET (US)
|
|
Edited by author 06-06-2009 02:10 PM
/m229: "A well-calibrated probability means that at a given probability the % of positive examples with the same or higher probability is the same as the probability value." Not "or higher": among the examples with *the same* predicted probability, the % that are positive is the same as the probability value. In order to verify calibration, you have to look at examples with the same probability +/- some tolerance.
"With a prediction threshold of 0.5, all examples are predicted to be negative, so the error rate is 5%." The standard threshold is 0.5 because that is optimal if false negatives and false positives have the same cost. Yes, error rate is 1 - accuracy.
"What is the max possible lift at 10% of this classifier?" In this context, 10% means taking the tenth of test examples with the highest predicted scores. There are many classifiers (SVMs, for example) that output real-valued scores that are not well-calibrated. For these classifiers, it makes sense to rank test examples and consider the highest-ranked ones. These are the examples with the highest probability, even though we don't know what their actual probabilities are.
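For anyone who wants to try these definitions on their own data, here is a minimal sketch in plain Python. It is only an illustration, not the quiz solution: the function names, the choice of ten equal-width bins as the "tolerance", and the toy data at the bottom are all assumptions made up for this example.

```python
def error_rate(scores, labels, threshold=0.5):
    """Fraction misclassified when predicting positive iff score >= threshold."""
    wrong = sum((s >= threshold) != (y == 1) for s, y in zip(scores, labels))
    return wrong / len(labels)


def calibration_table(scores, labels, n_bins=10):
    """Group examples with similar scores and compare with the observed rate.

    Returns one (mean score, fraction positive, count) row per non-empty bin.
    For a well-calibrated classifier the first two numbers are close.
    The equal-width binning is an assumption standing in for "+/- tolerance".
    """
    buckets = {}
    for s, y in zip(scores, labels):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        buckets.setdefault(b, []).append((s, y))
    rows = []
    for b in sorted(buckets):
        members = buckets[b]
        mean_score = sum(s for s, _ in members) / len(members)
        positive_rate = sum(y for _, y in members) / len(members)
        rows.append((mean_score, positive_rate, len(members)))
    return rows


def lift_at(scores, labels, fraction=0.10):
    """Positive rate among the top-scored fraction, divided by the base rate."""
    k = max(1, round(fraction * len(scores)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Toy data (made up): 100 examples, 5% positive, every score below 0.5,
# and the positives happen to have the highest scores.
toy_scores = [i / 250 for i in range(100)]
toy_labels = [0] * 95 + [1] * 5

print("error rate at threshold 0.5:", error_rate(toy_scores, toy_labels))
print("lift at 10%:", lift_at(toy_scores, toy_labels, 0.10))
for row in calibration_table(toy_scores, toy_labels):
    print("mean score %.3f  fraction positive %.3f  n=%d" % row)
```

Note that `lift_at` uses only the ranking of the scores, so it applies to uncalibrated scores too, whereas `calibration_table` is meaningful only if the scores are intended as probabilities.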
|
| Dave
|
229
|
 |
|
06-06-2009 01:17 PM ET (US)
|
|
Edited by author 06-06-2009 01:54 PM
I have a question about the meaning of well-calibrated probability and some of the questions on the quiz that we had concerning this.
As I understand it, a well-calibrated probability means that at a given probability the % of positive examples with the same or higher probability is the same as the probability value. Is this correct?
On part (b), your answer says "With a prediction threshold of 0.5, all examples are predicted to be negative, so the error rate is 5%....". How did you pick a threshold of 0.5? Or is this supposed to be 0.05? Also, is the "error rate" defined to be 1 - accuracy?
Based on this understanding, the part (c) question asks "What is the max possible lift at 10% of this classifier?" I am left wondering what the 10% means. Is that a 10% probability value, or something else? If it is the probability value, and 0.1 is between (or equal to) a and b, then wouldn't the answer be 0.1/0.05 = 2?
|
Charles Elkan
|
228
|
 |
|
06-02-2009 08:21 PM ET (US)
|
|
/m199: For anyone interested in yesterday's talk "MINING LARGE-SCALE CELL PHONE DATA" by Jean Bolot from Sprint: the paper was published at KDD last year and can be found at https://research.sprintlabs.com/publications/uploads/dpln.pdf
|
Charles Elkan
|
227
|
 |
|
06-02-2009 06:48 PM ET (US)
|
|
/m226: Thanks for finding this interesting paper. The first author is now a professor of Biology at UCSD, coincidentally.
|
| Kristen Jaskie
|
226
|
 |
|
06-02-2009 05:33 PM ET (US)
|
|
/m224: There is a paper I found titled: "Incremental and Decremental Support Vector Machine Learning" that was published in NIPS 2000 and so would have been available at the time this paper was written. The abstract states that "An on-line recursive algorithm for training support vector machines, one vector at a time, is presented." I believe that this is the definition of online that the authors of our paper used. http://cbcl.mit.edu/projects/cbcl/publicat...enberghs-nips00.pdf
|