Greg Hamerly | 3 | 05-14-2001 02:23 AM ET (US)

Hector, the paper describes a learning algorithm for smoothing probability estimates of vocabularies within HMM states. The structures, then, are fixed and not learned. What is learned are the HMM state vocabularies, transition probabilities, and mixture parameters for shrinkage.
While you're right that simple HMM structures may not be good at extracting from very differently structured sources, they can still learn a large vocabulary from different sources whose documents are similarly structured.
I hope to clear up some of the confusion tomorrow in my talk. Something I'm still a bit unclear on is what, exactly, is in the holdout set H_j they discuss in the section on EM. More detail would be useful here. However, I believe H_j is simply the set of words that have been annotated with label "j" in the training documents (i.e. j = target, non-target, etc.).
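To make the question about H_j concrete, here is a rough sketch of how I read the EM step. It is my own illustration, not code from the paper. It assumes the held-out words for state j are available as a plain list, and that the state is smoothed against two coarser estimates (a parent distribution and a uniform one). The names em_shrinkage_weights and shrunk_probability and the toy vocabulary are made up for the example.

```python
def em_shrinkage_weights(component_probs, heldout_words, iterations=50):
    """Estimate mixture weights over several word distributions by EM.

    component_probs: list of dicts mapping word -> probability, ordered from
        most specific (the state's own estimate) to least specific.
    heldout_words: words assumed to belong to this state's holdout set.
    """
    k = len(component_probs)
    weights = [1.0 / k] * k  # start from equal weights
    for _ in range(iterations):
        expected = [0.0] * k
        for word in heldout_words:
            # E-step: how much of this word each component accounts for.
            joint = [weights[i] * component_probs[i].get(word, 0.0) for i in range(k)]
            total = sum(joint)
            if total == 0.0:
                continue  # no component can explain this word
            for i in range(k):
                expected[i] += joint[i] / total
        norm = sum(expected)
        if norm == 0.0:
            break
        # M-step: renormalize the expected counts into new weights.
        weights = [e / norm for e in expected]
    return weights


def shrunk_probability(word, component_probs, weights):
    """Smoothed estimate: weighted mixture of the component estimates."""
    return sum(w * p.get(word, 0.0) for w, p in zip(weights, component_probs))


# Toy data (hypothetical): a four-word vocabulary.
state_mle = {"a": 1.0}                      # the state only ever saw "a"
parent_mle = {"a": 0.5, "b": 0.5}           # pooled over sibling states
uniform = {w: 0.25 for w in "abcd"}         # least specific estimate
components = [state_mle, parent_mle, uniform]

heldout = ["a", "b", "c"]
weights = em_shrinkage_weights(components, heldout)
smoothed_c = shrunk_probability("c", components, weights)
```

The point of the toy data is that the state's own estimate gives "c" zero probability, while the mixture does not, because the held-out word "c" forces some weight onto the uniform component. If H_j were the same words the state estimate was trained on, the weights would drift toward the state-specific estimate and the smoothing would buy very little, which is why the exact contents of H_j matter.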
Overall I think the method is interesting and fairly clear; what is really needed, though, are more conclusive empirical results. The results as presented leave one wondering which choice is really best for HMM information extraction.
|