Topic: Transductive inference for text classification using SVMs
Hector Jasso  7
04-16-2001 01:02 PM ET (US)
I also have my doubts about how appropriate the Precision/Recall-Breakeven Point is
for measuring the effectiveness of a text retrieval algorithm. Although it is a common
measure for evaluating text classifiers, as the authors mention, they introduce a
new definition for their own algorithm! At the very least, they should explain how the confidence
value of a Naive Bayes classifier or SVM translates to the corresponding measure for TSVMs,
which is defined only as the point where the number of false positives equals the
number of false negatives.
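
To make the false-positive/false-negative definition concrete, here is a minimal sketch showing that precision and recall coincide exactly when the two error counts are equal. The function name and the counts are made up for illustration; they are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp)  # fraction of predicted positives that are correct
    recall = tp / (tp + fn)     # fraction of actual positives that are found
    return precision, recall


# Hypothetical counts, for illustration only.
p_equal, r_equal = precision_recall(tp=40, fp=10, fn=10)  # FP == FN
p_diff, r_diff = precision_recall(tp=40, fp=5, fn=20)     # FP != FN

print(p_equal == r_equal)  # the two denominators match, so the ratios match
print(p_diff == r_diff)    # the denominators differ, so the ratios differ
```

Both ratios share the numerator TP, so they are equal precisely when TP + FP = TP + FN, that is, when FP = FN (assuming TP > 0).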

On a different point, I see a parallel between this paper and the one on boosting image
retrieval in that they both talk about sparse problem instances. And I think that although
it is valid to derive new algorithms for these types of problems, care should be taken as
to the type of benchmarks that are appropriate for testing.
Sameer Agarwal  6
04-16-2001 12:48 PM ET (US)
Hi,

Given a class of functions, the Vapnik-Chervonenkis dimension is a measure of how flexible that class is. Intuitively, it is the largest number of points "n" for which some set of n points can be labelled arbitrarily and still be classified correctly by at least one function in the class under consideration. So it's a measure of how powerful the class of functions you are considering is.
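
As a toy illustration of "labelled arbitrarily and yet classified correctly" (usually called shattering), the sketch below brute-forces every labelling of a few points on a line against 1-D threshold rules. The function names and the choice of threshold classifiers are my own assumptions for the example, not something from the paper. Checking two particular point sets does not prove the VC dimension by itself, but it shows the idea: the largest number of points that can be shattered is what the VC dimension counts.

```python
from itertools import product


def threshold_labelings(points):
    """Yield every labelling of `points` achievable by a 1-D threshold rule.

    A rule is (t, s): predict s if x > t, otherwise -s, with s in {+1, -1}.
    Points are assumed to be distinct.
    """
    xs = sorted(points)
    # Candidate thresholds: below all points, between neighbours, above all points.
    cuts = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    for t in cuts:
        for s in (+1, -1):
            yield tuple(s if x > t else -s for x in points)


def is_shattered(points):
    """True if every possible +1/-1 labelling is achieved by some threshold rule."""
    achievable = set(threshold_labelings(points))
    return all(lab in achievable for lab in product((+1, -1), repeat=len(points)))


print(is_shattered([0.0, 1.0]))       # True: all 4 labellings are reachable
print(is_shattered([0.0, 1.0, 2.0]))  # False: (+1, -1, +1) is not reachable
```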

I do not agree with Greg about it being an EM for SVMs. Since there is no Expectation step, the correction in the inner loop is just a greedy hill climb.

sameer

Kristin Branson  5
04-16-2001 11:47 AM ET (US)
I'm confused about a number of things in this paper ...

What is a VC-dimension? On a related note, what is "the structure" referring to?

On the bottom left of page 3, the author mentions that the transductive setting uses prior knowledge about the nature of P(x,y) that is not used in the inductive setting. My understanding of this algorithm is that it uses knowledge about the distribution of x in the test set that the inductive algorithm does not. Both algorithms have the same information about P(x,y) for the training examples, but the transductive algorithm has the extra knowledge about the distribution of x for the test set, not the distribution of y for the test set.

The P/R-breakeven point is defined in the paper as the value for which precision and recall are equal. Does the value refer to the number of documents classified as +? Can anyone explain the intuition behind why a high P/R-breakeven point is a good thing?
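
One common way to compute the breakeven point from a ranked list is sketched below; whether the paper computes it exactly this way is an assumption on my part, and the scores and labels are invented. The cut-off is the number of documents classified as +, and it is set to the number of truly positive documents. Precision and recall are then forced to be equal, and the reported value is that shared ratio.

```python
def pr_breakeven(scores, labels):
    """Precision (= recall) when the top-k scored documents are called positive,
    with k set to the number of truly positive documents (labels equal to 1)."""
    n_pos = sum(1 for y in labels if y == 1)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_k = ranked[:n_pos]
    tp = sum(1 for _, y in top_k if y == 1)
    fp = n_pos - tp  # called positive but actually negative
    fn = n_pos - tp  # actually positive but ranked below the cut
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    assert precision == recall  # holds by construction because fp == fn
    return precision


# Hypothetical classifier scores and true labels (1 = relevant, 0 = not).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 1, 0, 0]
print(pr_breakeven(scores, labels))  # 2 of the top 3 documents are relevant
```

On this reading, a high value means the classifier ranks most of the relevant documents above the irrelevant ones, which is why higher is better.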
Greg Hamerly  4
04-16-2001 11:06 AM ET (US)
First, the TSVM algorithm seems to me like a sort of EM for SVMs. Better yet, it seems even more like K-means because there is no partial assignment, only hard assignment, of test documents to the positive/negative classes. If this is the case, I think they should have used common terminology to describe this, rather than the term "transductive", which I don't think they defined well.
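
To spell out the hard-versus-partial distinction, here is a small sketch. The logistic squashing used for the soft case is only an illustrative choice of mine, and neither function is taken from the paper.

```python
import math


def hard_assign(score):
    """K-means-style: commit fully to one class based on the sign of the score."""
    return 1 if score >= 0 else -1


def soft_assign(score):
    """EM-style: a degree of membership in the positive class, between 0 and 1.

    The logistic function is an illustrative assumption, not the paper's method.
    """
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical classifier outputs for four test documents.
for s in (-2.0, -0.1, 0.1, 2.0):
    print(s, hard_assign(s), round(soft_assign(s), 3))
```

A document scored just above zero gets the same hard label as one scored far above zero, while the soft version keeps that difference.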

Second, they report average P/R breakeven numbers. They should have also reported the standard deviations for each of these numbers, to allow the reader to see if the averages are significantly different. Also, they say that the P/R breakeven point is a standard metric, but I would have appreciated the exact equation for computing that number.
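
As a sketch of the kind of reporting I mean, the snippet below prints a mean and a sample standard deviation for two methods. All numbers are hypothetical placeholders, not results from the paper.

```python
from statistics import mean, stdev

# Hypothetical per-run breakeven scores, NOT taken from the paper.
runs_a = [58.0, 61.5, 60.2, 59.1, 62.3]
runs_b = [60.4, 66.0, 55.9, 63.7, 58.5]

for name, runs in (("method A", runs_a), ("method B", runs_b)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{name}: mean={mean(runs):.1f}, std={stdev(runs):.1f}")
```

With placeholder numbers like these, the two means are close but the spreads are very different, which is exactly what a bare average hides.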

I have other general gripes about this paper's overuse of mathematical symbols and complexity; I feel it could have been simplified.
Dave Kauchak  3
04-16-2001 03:15 AM ET (US)
I thought the paper did a relatively good job of selecting and analyzing a number of diverse situations. The paper not only tries to show that TSVMs are better for the problem at hand, it also examines a number of different variables that might be varied.

More explicitly, to show that TSVMs were better, the paper presented test averages with training sets ranging from 7 to 120 examples and test sets from 3,299 to 10,000 examples. The paper went on, however, to show the effect of varying the sizes of these two sets.

However, there were a number of factors with respect to testing that I felt the paper could have explored better. First, all of the examples that were chosen represent tests containing a relatively small set of categories. In fact, many of the categories were actually trimmed for experimentation (for example, in the case of the Reuters database, there were 135 potential categories, but only the 10 most frequent were chosen). I would like to have seen more justification for this size reduction and also the effect that category size has on the effectiveness of the algorithm.

Second, the paper mentions that there are a number of parameters that the user inputs into the algorithm. These seem to be mentioned only in the algorithm description section; there is no mention of these tuning parameters in the actual experimentation section. I would be curious what effect these parameters have on the effectiveness of the TSVM and also how difficult it is to fine-tune them.
Joe Drish  2
04-10-2001 12:07 AM ET (US)
Another very small but fishy mistake in the paper can be found in Figure 5: the reported average P/R breakeven point for the TSVM. The last row, the average, is computed by just adding the numbers in each column and dividing by 10. If you add up the numbers in the TSVM column and divide by 10, you get 61.3, not 60.8.

All of the other numbers work out correctly, including those in figures 8 and 9.
Joe Drish  1
04-09-2001 01:51 PM ET (US)
Edited by author 04-09-2001 01:51 PM
There are a few simple typos in Section 4 in the paper where the example in Figure 3 is discussed.

1. "How should we classify documents D2 to D4..." should be "How should we classify documents D2 to D5..."

2. "...into class A, and D3 and D4 into class B" should be "...into class A, and D4 and D5 into class B".

3. "Assigning D2 and D3 to class A and D3 and D4 to class B is the maximum..." should be "Assigning D2 and D3 to class A and D4 and D5 to class B is the maximum..."

At least these changes make more sense.