Kristin Branson
5
04-16-2001 11:47 AM ET (US)

I'm confused about a number of things in this paper ...
What is a VC-dimension? On a related note, what is "the structure" referring to?
On the bottom left of page 3, the author says that the transductive setting uses prior knowledge about the nature of P(x,y) that the inductive setting does not use. My understanding is that the transductive algorithm uses knowledge about the distribution of x in the test set, which the inductive algorithm does not have. Both algorithms get the same information about P(x,y) from the training examples; the only extra knowledge the transductive algorithm has is the distribution of x in the test set, not the distribution of y in the test set.
The P/R-breakeven point is defined in the paper as the value for which precision and recall are equal. Does the value refer to the number of documents classified as +? Can anyone explain the intuition behind why a high P/R-breakeven point is a good thing?
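One way to make the definition concrete is the minimal sketch below. It is not from the paper. It assumes the classifier gives each document a real-valued score and that the k highest-scoring documents are labeled +. Under that assumption precision is TP/k and recall is TP/(number of truly + documents), so the two are equal exactly when k equals the number of truly + documents, and the breakeven "value" would be the shared precision/recall at that cutoff rather than k itself. The function names and the toy scores and labels are hypothetical.

```python
def precision_recall_at_k(scores, labels, k):
    """Label the k highest-scoring documents as + and return (precision, recall).

    Assumes labels are 0/1, at least one label is 1, and 1 <= k <= len(scores).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = sum(label for _, label in ranked[:k])
    n_pos = sum(labels)
    return true_pos / k, true_pos / n_pos


def pr_breakeven(scores, labels):
    """Precision (= recall) when as many documents are labeled + as are truly +."""
    k = sum(labels)  # cutoff at which the two denominators coincide
    precision, recall = precision_recall_at_k(scores, labels, k)
    assert precision == recall
    return precision


# Hypothetical toy data: six documents, three of them truly +.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 1, 0, 0]

for k in range(1, len(scores) + 1):
    print(k, precision_recall_at_k(scores, labels, k))
print("breakeven:", pr_breakeven(scores, labels))
```

In this toy ranking, labeling fewer documents as + tends to favor precision and labeling more favors recall. The breakeven cutoff sits where neither is traded against the other, so a high value there would mean the truly + documents are mostly ranked at the top.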