Topic: Document clustering using word clusters via the information bottleneck
Dave Kauchak (message 4)
05-14-2002 04:47 PM ET (US)
As Degui comments, I think the initial clustering into word clusters is a way to reduce the dimensionality of the data. I agree with Kristen that the authors' motivation for using clusters is not made very obvious, but I do think the point is to reduce the noise that can occur when using raw words. This sort of grouping seems similar in effect to what supervised learning methods do when they try to generalize from possibly noisy samples.
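
To make the dimensionality point concrete, here is a minimal sketch of what replacing raw words with word clusters does to a document's representation. The word-to-cluster table below is hand-made and purely hypothetical; the paper derives its clusters with the information bottleneck, which this sketch does not attempt to reproduce.

```python
from collections import Counter

# Hypothetical, hand-made word-to-cluster assignment (an assumption for
# illustration only; not the clusters the paper's method would produce).
WORD_TO_CLUSTER = {
    "goal": "sports", "team": "sports", "score": "sports",
    "cpu": "computing", "disk": "computing", "kernel": "computing",
}

def cluster_counts(tokens, word_to_cluster=WORD_TO_CLUSTER):
    """Collapse raw word counts into word-cluster counts.

    Words without a cluster assignment are dropped.
    """
    counts = Counter()
    for tok in tokens:
        cluster = word_to_cluster.get(tok.lower())
        if cluster is not None:
            counts[cluster] += 1
    return counts

doc = "The team scored a late goal after the kernel of the defense broke".split()
print(cluster_counts(doc))
```

However many distinct words appear in the vocabulary, each document ends up described by one count per cluster, so rare or noisy words are pooled with related ones instead of each getting its own sparse feature.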

I thought the experiments showing that the double methods performed better than the single methods were a good start toward experimentally motivating the use of word clusters over raw words. However, I was a bit disappointed with the experimental setup.

I'm glad the authors tried to use a concrete data set on which performance could be measured easily. However, I think they could have done a better job of justifying their own choices, given how much time they spent explaining why previous setups were so bad.

The authors briefly mention stop-lists and word stemming, but I think they dismiss these ideas too quickly. Why weren't these preprocessing methods used? Word stemming in particular seems like another way of reducing word noise in the documents. The other methods might have performed better if word stemming or an equivalent process had been applied.
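
As a rough illustration of the kind of preprocessing I mean, here is a minimal sketch of a stop-list filter followed by stemming. Both the stop-list and the suffix-stripping rule are toy assumptions of mine (a real setup would more likely use a full stop-list and an established stemmer such as Porter's); nothing here comes from the paper.

```python
# Toy stop-list and suffix rules, assumed for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are"}
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, strip surrounding punctuation, drop stop words, then stem."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return [naive_stem(t) for t in tokens if t and t not in STOP_WORDS]

print(preprocess("The clusters are clustering the clustered documents."))
```

Even this crude rule maps "clusters", "clustering", and "clustered" to the same token, which is the noise-reduction effect I would have liked to see compared against the word-cluster approach.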