Dave Kauchak | 4 | 05-14-2002 04:47 PM ET (US)
As Degui comments, I think the initial clustering into word clusters is a way to reduce the dimensionality of the data. I agree with Kristen that the authors' motivation for using clusters is not made very obvious, but I do think it is to reduce the noise that can occur when using raw words. I think this sort of grouping is similar in effect to what supervised learning methods do when they try to generalize from possibly noisy samples.
I thought the experiments showing that the double methods performed better than the single methods were a good start toward experimentally motivating the use of word clusters over raw words. However, I was a bit disappointed with the experimental setup.
I'm glad the authors tried to use a concrete data set on which performance could be measured easily. However, I think they could have done a better job justifying their own choices after spending so much time explaining why previous setups were so bad.
The authors briefly mention stop-lists and word stemming. I think that they dismiss these ideas too quickly. Why weren't these preprocessing methods used? Word stemming in particular seems like another way of reducing the word noise in the documents. The other methods may have performed better if word stemming or an equivalent process had been used.
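To make the stemming point concrete, here is a minimal sketch of stop-list filtering plus stemming shrinking a vocabulary. Everything in it is an assumption made up for illustration: the tiny stop-list, the suffix rules, and the two sample documents. The suffix stripper is a crude stand-in for a real stemmer such as Porter's, not anything the paper's authors used.

```python
import re

# Hypothetical, tiny stop-list for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "is", "are", "to", "in"}

# Crude suffix rules; a stand-in for a real stemmer, not Porter's algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def vocabulary(docs, transform):
    """Collect the set of distinct tokens produced by `transform`."""
    return {tok for doc in docs for tok in transform(doc)}


# Made-up sample documents.
docs = [
    "The cluster is clustering the clustered words",
    "Clusters of words and a word",
]

raw_vocab = vocabulary(docs, tokenize)
stemmed_vocab = vocabulary(docs, preprocess)

print(len(raw_vocab), "raw word types ->", len(stemmed_vocab), "after preprocessing")
```

On these two toy documents the raw vocabulary collapses to just the stems "cluster" and "word", which is the kind of noise reduction one might compare against clustering. Whether the same would hold on the paper's data is exactly the open question.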