Greg Hamerly | 5 | 04-30-2002 03:22 PM ET (US)
This was a good paper, and well-written. I really liked the idea of section 2.1, where they define a probability of mixture overlap (and thus of discriminative power).
I think figure 2 is confusing and misleading. My understanding is that the authors would choose features first based on epsilon [figure 2(a)], then from that subset they would use information gain [2(b)], then finally from that subset use the Markov blanket. However, in 2(a) they only show genes that have epsilon < 0.5, which is about 4500 genes. But then in 2(b) they show information gain for what appears to be all of the data rather than just that subset. From 2(b) to 2(c) they do seem to be consistent, taking only the 360 genes filtered by information gain.
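The sequential reading described above can be sketched roughly as follows. This is only an illustration of that reading, not the authors' code: the function names, the dict-based inputs, and the mean-split discretization used for information gain are all assumptions made here, the epsilon values are taken as given rather than computed from a mixture model, and the final Markov blanket stage is left out. Only the 0.5 cutoff and the 360-gene count come from the figure.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(values, labels):
    """Information gain of splitting `values` at their mean.

    The mean split is an assumed discretization for illustration;
    the paper may discretize expression levels differently.
    """
    threshold = sum(values) / len(values)
    low = [y for v, y in zip(values, labels) if v <= threshold]
    high = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    conditional = (len(low) / n) * entropy(low) + (len(high) / n) * entropy(high)
    return entropy(labels) - conditional


def sequential_filter(expr, labels, epsilon, eps_max=0.5, top_k=360):
    """Apply the epsilon filter, then rank the survivors by information gain.

    expr:    dict mapping gene name -> list of expression values (hypothetical layout)
    epsilon: dict mapping gene name -> overlap probability, assumed precomputed
    """
    # Stage 1: keep only genes whose mixture overlap is below the cutoff.
    stage1 = [g for g in expr if epsilon[g] < eps_max]
    # Stage 2: rank that subset (not all genes) by information gain, keep the top k.
    ranked = sorted(stage1, key=lambda g: information_gain(expr[g], labels), reverse=True)
    return ranked[:top_k]
```

Under this reading, a plot for the second stage would only contain genes that passed the first stage, which is why showing information gain for every gene in 2(b) looks inconsistent.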
I agree with Victor about the comparisons with other methods. I am not familiar with what the state of the art is for gene expression classification, but a comparison against SVMs or some other approach would be nice to see.
Overall, good paper.