| Robin Hewitt
|
4
|
 |
|
10-11-2004 04:59 PM ET (US)
|
|
I think the value of this method depends a lot on how important the tree is to you. Often the tree is just an artifact of the clustering method (e.g., Ward's). Typically, one is only interested in a few levels: near the root, the groupings are too diverse to be meaningful, and near the leaves you start losing track of things you want to keep together. For example, if you have 20 identical somethings, you don't care at all that this group can theoretically be split 19 more times.
Given that, I think it would have been good for the authors to compare their method to standard industry practices for combining Ward's and K-means. They're working with such small datasets (n<1000) that it's very feasible to first cluster by Ward's, stop at some level, then improve the result with K-means. You'd lose the hierarchy, but as mentioned, that's often of little interest. Or they could split those clusters divisively, using their same cost metric, to recover a new lower portion of the tree if they wanted it. I would have liked to see a comparison of their method with this type of standard practice.
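To make the Ward's-then-K-means idea concrete, here is a rough pure-Python sketch. It is only an illustration under my own assumptions, not the authors' method: the function names (`ward_clusters`, `kmeans_refine`) are made up, the Ward's step is the naive cubic-time version (fine for n<1000), and the distance is plain squared Euclidean.

```python
def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_clusters(points, k):
    """Naive agglomerative Ward's: repeatedly merge the pair of clusters
    whose merge adds the least within-cluster sum of squares, stopping
    when k clusters remain (i.e. cutting the tree at that level)."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                # Ward's merge cost: |A||B| / (|A|+|B|) * ||mean(A) - mean(B)||^2
                cost = (len(a) * len(b) / (len(a) + len(b))
                        * sq_dist(centroid(a), centroid(b)))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def kmeans_refine(points, centers, max_iter=100):
    """Lloyd's K-means iterations, seeded with the given centers.
    Returns (groups, centers); the hierarchy is discarded at this point."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        # Keep the old center if a cluster happens to end up empty.
        new_centers = [centroid(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return groups, centers
```

Usage would be something like `seed = ward_clusters(data, k)` followed by `groups, centers = kmeans_refine(data, [centroid(c) for c in seed])`. Each refined group could then be split divisively if a lower portion of the tree is wanted.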
- Robin
|