

statistics of natural image categories

Carlos Vallespi
09:15 AM ET (US)
I agree with Pete that this method alone gives good results mainly on the kind of images chosen in the paper. Still, it is very interesting that a simple look at the frequency domain can tell these two sets of images apart. This might not be enough for other tasks, but it at least gives an initial estimate of what kind of image and mean depth we are dealing with. I would like to see this SPC kind of representation working with EMD.
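To make the "simple look at the frequency domain" concrete, here is a toy sketch. It is not the paper's pipeline: the helper names `power_spectrum` and `axis_energy` and the 8x8 stripe images are made up for illustration, and the DFT is a naive pure-Python one that only suits tiny inputs. It shows how the energy along the two frequency axes separates an image of vertical stripes from one of horizontal stripes.

```python
import cmath

def power_spectrum(img):
    """Naive 2-D DFT power spectrum of a small square grayscale image (list of rows)."""
    n = len(img)
    spec = [[0.0] * n for _ in range(n)]
    for v in range(n):          # vertical frequency index
        for u in range(n):      # horizontal frequency index
            acc = 0j
            for y in range(n):
                for x in range(n):
                    acc += img[y][x] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / n)
            spec[v][u] = abs(acc) ** 2
    return spec

def axis_energy(spec):
    """Energy on the horizontal- and vertical-frequency axes, excluding the DC term."""
    n = len(spec)
    horiz = sum(spec[0][u] for u in range(1, n))  # intensity changes along x
    vert = sum(spec[v][0] for v in range(1, n))   # intensity changes along y
    return horiz, vert

# Hypothetical test images: 8x8 stripes, two pixels wide
stripes_v = [[1.0 if (x // 2) % 2 == 0 else 0.0 for x in range(8)] for _ in range(8)]
stripes_h = [[1.0 if (y // 2) % 2 == 0 else 0.0 for _ in range(8)] for y in range(8)]

for name, img in (("vertical stripes", stripes_v), ("horizontal stripes", stripes_h)):
    horiz, vert = axis_energy(power_spectrum(img))
    print(name, "horizontal-axis energy > vertical-axis energy:", horiz > vert)
```

A real experiment would use an FFT on full-size images and summarize the whole spectrum rather than just the two axes; the point here is only that a global spectral statistic already separates the two toy patterns.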
Mohit Gupta
04:38 AM ET (US)
Indeed, this method will not be able to tell a cat from a dog if they always occur in the same environment. However, I would not call this a 'breaking down' of the method, since its motivation is not exact identification. Because the method is based only on global statistics of the image, it can be used as a **fast** pre-processing step for object detection. For example, many images, or large parts of images, can be pruned out at the preliminary stage.

To accurately recognize an object (a cat or a dog), one can then do a local search on the remaining data. Performing a local search can be very expensive, which justifies using this method for pre-processing.
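A minimal sketch of this two-stage idea, under assumptions of my own: `prune_then_search`, the score threshold, and the toy image records are hypothetical, and the "slow detector" is a stand-in that just reads a stored flag. The cheap global score decides which images the expensive local search ever sees.

```python
def prune_then_search(images, global_score, local_detector, threshold):
    """Run the costly local detector only on images whose cheap global score passes."""
    survivors = [img for img in images if global_score(img) >= threshold]
    hits = [img for img in survivors if local_detector(img)]
    return hits, len(survivors)

# Toy data: each "image" carries a precomputed global score and a ground-truth flag
images = [
    {"name": "street", "score": 0.9, "has_object": True},
    {"name": "forest", "score": 0.2, "has_object": False},
    {"name": "plaza", "score": 0.7, "has_object": False},
    {"name": "beach", "score": 0.1, "has_object": False},
]

calls = []  # records which images the expensive stage actually processed

def slow_detector(img):
    """Stand-in for an expensive local search over the image."""
    calls.append(img["name"])
    return img["has_object"]

hits, n_searched = prune_then_search(
    images, lambda im: im["score"], slow_detector, threshold=0.5
)
print([h["name"] for h in hits], n_searched, "of", len(images), "searched")
```

The trade-off is that any image pruned in the first stage is never searched, so the threshold has to be loose enough that the global statistics rarely reject an image that does contain the object.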
Stefan Zickler
01:17 AM ET (US)
I agree with Pete, and also think that this methodology of image statistics probably only works for rather global correlational statements such as guessing whether an image contains cars or not, solely based on its statistics. My prediction is that this methodology will break down as soon as we try to predict the existence of different objects that occur in similar environments (e.g. scenes with dogs vs. scenes with cats, instead of just the rather global category "animals").
Pete Barnum
07:53 PM ET (US)
Torralba and Oliva's results look pretty good for the images that they used, but judging from the few samples that they showed, those images already represent a sampling of scenes that people find interesting. Even the "panoramic" views seem a bit restricted. For instance, they show a picture of a pedestrian, but it seems equally likely that a random picture from that environment would have half a pedestrian or just a brick wall. I'd be interested to see if they would be able to get interesting results from a 360-degree spherical panorama. I imagine that a large view would have distinctive characteristics, whether they would be from changing depth estimations or different content.
Edited 01-28-2006 07:58 PM
Dave Bradley
09:37 PM ET (US)
Please post here on Torralba's "Statistics of Natural Image Categories"
