Joe Drish
3
04-10-2002 06:12 AM ET (US)
Edited by author 04-10-2002 06:13 AM
It's amazing how well the theoretical results match the experimental results. It would have been convenient if the authors had provided the sample sizes of the datasets (correct me if I'm wrong, but I don't think the values of m on the horizontal axes are raw numbers), but I suppose those can be looked up at the UCI repository. Second, I'm curious why logistic regression has not been praised or used much in commercial data mining applications, given the conventional folk wisdom "articulated" by Vapnik, as discussed in the paper's introduction, and given the results presented here. Naive Bayes, not logistic regression, is highly touted by many data mining experts.
Elkan claims in "Boosting and Naive Bayesian Learning" (1997) that naive Bayesian classification is just a nonparametric, nonlinear extension of logistic regression. Is the analysis conducted by Ng and Jordan even necessary, given that naive Bayes is just an extension of logistic regression? The main difference between the two methods is that naive Bayes requires an intermediate step, but is this the only high-level conceptual difference?
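To make the question concrete, here is a standard textbook derivation for the binary-feature case. It is not taken from either paper, and the symbols \pi = P(y=1) and \theta_{jc} = P(x_j = 1 \mid y = c) are notation introduced here only for illustration. Under the naive Bayes independence assumption the log-odds come out linear in x:

```latex
\log\frac{P(y=1\mid x)}{P(y=0\mid x)}
  = \underbrace{\log\frac{\pi}{1-\pi}
      + \sum_{j}\log\frac{1-\theta_{j1}}{1-\theta_{j0}}}_{b}
  + \sum_{j} x_j\,
    \underbrace{\log\frac{\theta_{j1}\,(1-\theta_{j0})}{\theta_{j0}\,(1-\theta_{j1})}}_{w_j}
```

So in this case both methods produce a classifier of the form sign(w·x + b). The difference lies in how w and b are chosen: naive Bayes first estimates \pi and the \theta_{jc} from the joint likelihood (presumably the "intermediate step") and derives w and b from them, while logistic regression fits w and b directly by maximizing the conditional likelihood of y given x.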
I take issue with the way the results are expressed: that the generative model approaches its asymptotic error [faster] than the discriminative model. It does so, but naive Bayes also starts out with a much lower error. Empirically, the slopes of the curves are very much alike as the training set size (m) increases. The authors do say that this is what is happening, but not clearly.
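For anyone who wants to look at the starting error and the slope separately, here is a minimal sketch of a learning-curve comparison in plain Python. It is not the experimental setup from the paper: the data are synthetic (binary features drawn independently given the label, so the naive Bayes assumption holds exactly), and every function name, the feature probabilities 0.7/0.3, the learning rate, and the epoch count are assumptions chosen only for illustration.

```python
import math
import random

def sample(m, n, rng):
    """Draw m examples with n binary features that are independent given the label."""
    data = []
    for _ in range(m):
        y = rng.randint(0, 1)
        p = 0.7 if y == 1 else 0.3          # assumed P(x_j = 1 | y)
        x = [1 if rng.random() < p else 0 for _ in range(n)]
        data.append((x, y))
    return data

def train_nb(data, n):
    """Naive Bayes with Laplace smoothing, returned as a linear model (b, w)."""
    class_n = [0, 0]
    ones = [[0] * n for _ in range(2)]
    for x, y in data:
        class_n[y] += 1
        for j in range(n):
            ones[y][j] += x[j]
    # Smoothed estimates of P(x_j = 1 | y = c); always strictly inside (0, 1).
    theta = [[(ones[c][j] + 1) / (class_n[c] + 2) for j in range(n)]
             for c in range(2)]
    prior1 = (class_n[1] + 1) / (len(data) + 2)
    b = math.log(prior1 / (1 - prior1))
    w = []
    for j in range(n):
        t0, t1 = theta[0][j], theta[1][j]
        b += math.log((1 - t1) / (1 - t0))
        w.append(math.log(t1 * (1 - t0) / (t0 * (1 - t1))))
    return b, w

def train_lr(data, n, epochs=200, lr=0.1):
    """Logistic regression fit by batch gradient ascent on the conditional log-likelihood."""
    b, w = 0.0, [0.0] * n
    for _ in range(epochs):
        gb, gw = 0.0, [0.0] * n
        for x, y in data:
            z = b + sum(wj * xj for wj, xj in zip(w, x))
            err = y - 1.0 / (1.0 + math.exp(-z))
            gb += err
            for j in range(n):
                gw[j] += err * x[j]
        b += lr * gb / len(data)
        w = [wj + lr * g / len(data) for wj, g in zip(w, gw)]
    return b, w

def error_rate(model, data):
    """Fraction of examples misclassified by the rule 'predict 1 if w.x + b > 0'."""
    b, w = model
    wrong = 0
    for x, y in data:
        z = b + sum(wj * xj for wj, xj in zip(w, x))
        wrong += int((z > 0) != (y == 1))
    return wrong / len(data)

def learning_curve(sizes, n=20, seed=0):
    """Return (m, naive Bayes error, logistic regression error) for each training size m."""
    rng = random.Random(seed)
    test = sample(500, n, rng)
    curve = []
    for m in sizes:
        train = sample(m, n, rng)
        curve.append((m,
                      error_rate(train_nb(train, n), test),
                      error_rate(train_lr(train, n), test)))
    return curve

if __name__ == "__main__":
    for m, nb_err, lr_err in learning_curve([10, 20, 50, 100, 200]):
        print(m, round(nb_err, 3), round(lr_err, 3))
```

Printing the two error columns side by side for increasing m shows both where each curve starts and how quickly it falls. A single seed on synthetic data is noisy, so averaging over several seeds would be needed before reading anything into the slopes.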