Mohit Gupta | 2 | 03-06-2006 01:52 AM ET (US)
This paper presents an object detection framework capable of processing images at real-time speeds while achieving high accuracy. It introduces a few novel ideas (integral images) and also brings together many existing concepts (feature selection using AdaBoost and hierarchical classification).
However, I have a few concerns (which might be shared by others too):
0) Given the specific (and rather inflexible) nature of the features, I wonder whether this is really an object detection system or just a frontal-face detection system. Although the authors mention in passing towards the end that it can be applied to pedestrian detection, their features seem very specific to faces.
1) Another issue is that the method doesn't seem to be invariant to either pose or translation, even though the authors mention that it can 'absorb' small translations. It seems to work well on the frontal-face dataset; for more credibility, they should have provided results on other, harder datasets.
2) The training time (on the order of weeks) seems too long, although I have to admit that I am not familiar with what counts as a typical training time for current systems.
3) Since they work with small 24x24 patches and use differences between the intensities of image regions as features, I am curious how well a naive classifier based on raw pixel intensities would work (a 576-dimensional feature vector). Presumably such features would take far less time to train on?
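To make the comparison in point 3 concrete, here is a minimal sketch in plain Python of the two representations: a rectangle-difference feature computed from an integral image, and the naive 576-dimensional raw pixel vector. It is not the authors' implementation. The function names, the synthetic patch, and the left-minus-right sign convention are all assumptions made for illustration.

```python
def integral_image(img):
    """Summed-area table, padded with a zero row and a zero column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels inside a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half (sign convention is an assumption)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


def raw_pixel_vector(img):
    """Naive alternative: flatten the patch into one long vector."""
    return [p for row in img for p in row]


PATCH = 24  # patch side length mentioned in point 3
# Synthetic patch for illustration only, not real image data.
patch = [[(x + y) % 256 for x in range(PATCH)] for y in range(PATCH)]

ii = integral_image(patch)
vec = raw_pixel_vector(patch)
feature = two_rect_feature(ii, 0, 0, PATCH, PATCH)
```

The raw vector has a fixed length of 24 * 24 = 576 and needs no per-feature computation, whereas each rectangle feature costs a handful of lookups once the integral image is built. Whether a classifier trained on the raw vector would be competitive is exactly the open question above; the sketch only shows what each input looks like.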