Aldebaro Klautau | 4 | 04-04-2001 06:45 AM ET (US)
The proposed features are interesting, but I could not see how "over one million images can be scanned per second." As mentioned, the primitive filters are regular enough to allow tricks that compute the convolutions efficiently. Still, the number of operations seems high unless one assumes small images. Instead of the two paragraphs after equation (4), which mention kurtosis, etc., I would prefer some discussion of the computational cost.
I also think the paragraph about wavelets is too vague. The features proposed by the authors are not as sensitive to shifts as conventional wavelet decompositions because of the summation in equation (1). In image retrieval, as opposed to image coding, there is no need to reconstruct the image from the feature representation, so the filters and operations are not required to have the perfect reconstruction (PR) property, etc. The authors used this flexibility to adopt a kind of over-complete representation and then sum all the "coefficients" (filter outputs), which discards spatial localization. If one drops some of the requirements imposed on wavelets (or PR filter banks), similar results could probably be obtained with a "wavelet-like" decomposition followed by the summation in (1). In my view, the particular choice of primitive filters should be determined mainly by computational cost. Again, this aspect was not explored in the paper.
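To make the shift argument concrete, here is a minimal sketch of my own, not the authors' code. It assumes circular (wrap-around) borders, a made-up 2x2 edge filter, and an absolute value before the sum; none of these details are taken from the paper. Summing the filter output over all positions gives the same feature for a shifted image, while the individual filter outputs do change.

```python
import numpy as np

def circular_filter(image, kernel):
    """Correlate image with a small kernel, wrapping around at the borders."""
    out = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    for dy in range(kh):
        for dx in range(kw):
            # Shift the whole image so pixel (y + dy, x + dx) lands on (y, x).
            out += kernel[dy, dx] * np.roll(image, shift=(-dy, -dx), axis=(0, 1))
    return out

def summed_feature(image, kernel):
    """Collapse the filter output into one number by summing over all positions."""
    return np.abs(circular_filter(image, kernel)).sum()

rng = np.random.default_rng(0)
image = rng.random((16, 16))
kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # hypothetical edge-like filter

# Circularly shift the image by 3 rows and 5 columns.
shifted = np.roll(image, shift=(3, 5), axis=(0, 1))

response = circular_filter(image, kernel)
response_shifted = circular_filter(shifted, kernel)
feature = summed_feature(image, kernel)
feature_shifted = summed_feature(shifted, kernel)

print("per-position outputs equal:", np.allclose(response, response_shifted))
print("summed features equal:     ", np.isclose(feature, feature_shifted))
```

With non-circular borders or with downsampling between filtering stages, the invariance would only be approximate, which is why I say "not as sensitive" rather than "invariant."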
Besides the features, there is the boosting strategy. I have a question about the authors' adaptation of the original AdaBoost: is this the first paper to associate each feature with one weak learner? I liked the paper, and if the answer is yes, I would consider it a very good one, because the idea of using boosting to select features is simple but very powerful.
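To illustrate what I mean by boosting as feature selection, here is a minimal sketch of discrete AdaBoost in which each weak learner looks at exactly one feature. The thresholded stump, the toy data, and the names `best_stump` and `adaboost_select` are my own assumptions for illustration and are not necessarily what the authors use. Each round picks the single feature whose stump has the lowest weighted error, so the list of chosen stumps is also a list of selected features.

```python
import numpy as np

def best_stump(X, y, w):
    """Return (error, feature, threshold, polarity) of the lowest weighted-error stump."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):            # one candidate weak learner per feature
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost_select(X, y, rounds):
    """Run discrete AdaBoost; return one (feature, threshold, polarity, alpha) per round."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)                # start with uniform example weights
    chosen = []
    for _ in range(rounds):
        err, j, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)     # up-weight the misclassified examples
        w /= w.sum()
        chosen.append((j, thr, pol, alpha))
    return chosen

# Toy data: 10 features, only feature 3 carries the label, with 10% label noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = np.where(X[:, 3] > 0.2, 1, -1)
flip = rng.random(200) < 0.1
y = np.where(flip, -y, y)

chosen = adaboost_select(X, y, rounds=3)
for j, thr, pol, alpha in chosen:
    print(f"feature {j}, threshold {thr:.3f}, polarity {pol}, alpha {alpha:.3f}")
```

On this toy problem the first round should pick feature 3, since no other feature is related to the label. The exhaustive threshold search is only acceptable for a small example; with a very large feature pool the cost of this step is exactly the kind of thing I would like to see discussed.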