Dave Kauchak | 1 | 06-03-2001 09:11 PM ET (US)
I thought that the idea of including information from linked sites was very interesting.
I particularly liked the introduction of the paper. They give a good chain of ideas that leads to the development of their system, as well as the motivation for each step they take. The ideas are explained in a general way, so that even with minimal background knowledge a reader can get a good idea of the paper's goals and methodologies.
Another nice thing that the paper does is build on existing research. The authors build up their system from a general text classifier. The nice thing about this is that any improvements to basic text classifiers will result in gains in their system with little or no extra work.
I found some of the experiments a bit confusing. Their first experiment tests three different cases. In the third case they distinguish between local and non-local data. I am not exactly clear on how they actually used this distinction, though.
I liked how the paper showed the performance increase with an increase in knowledge about the neighbors. I thought it was quite interesting that they got such good results from the prefix+Link even without any neighbor classifications. However, I would have appreciated some discussion of the likelihood or difficulty of actually having that many neighbors classified correctly. When we are dealing with such large datasets, even pre-classifying some small percentage may be extremely time-consuming.
Dave