QuickTopic (SM) free message boards
Topic: Learning to Detect Objects in Images via a Sparse, Part-Based Representation
Adam  1
11-08-2006 08:42 PM ET (US)
It's interesting they used such simple interest points and similarity measures (Forstner, cross-correlation). Do you think they could have improved performance by using more robust features?
Deborah  2
11-09-2006 05:17 AM ET (US)
Edited by author 11-09-2006 05:18 AM
I can't find Forstner & Gulch's 1987 paper, "A fast operator for detection and precise location of distinct points, corners, and centers of circular features," where they introduce the Forstner interest operator! Will you please go over how it works? Thank you!

Comments:

*I really like how well-written this paper is, and how they define so much of the computer vision terminology, especially for evaluation methods & scores.

*FYI, here is a link to MATLAB source code for the Forstner interest operator:
http://www.vision.caltech.edu/html-files/E...rojects/foerstner.m
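
For anyone else wondering how the operator works, here is a rough sketch in plain Python. It is not taken from the 1987 paper or from the MATLAB file above. It assumes the usual textbook formulation: sum the gradient outer products over a small window to get a 2x2 matrix, then score each pixel by w = det/trace (strength) and q = 4*det/trace^2 (roundness, 1 for an ideal corner, 0 for a straight edge). The function names, the window radius, and the thresholds are my own choices for illustration.

```python
def forstner_measures(img, radius=1):
    """Return (W, Q) maps for a grayscale image given as a list of rows."""
    rows, cols = len(img), len(img[0])
    # Central-difference gradients; border pixels are left at zero.
    gx = [[0.0] * cols for _ in range(rows)]
    gy = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    W = [[0.0] * cols for _ in range(rows)]
    Q = [[0.0] * cols for _ in range(rows)]
    for y in range(radius, rows - radius):
        for x in range(radius, cols - radius):
            # Entries of the summed gradient matrix [[a, b], [b, c]].
            a = b = c = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ix, iy = gx[y + dy][x + dx], gy[y + dy][x + dx]
                    a += ix * ix
                    b += ix * iy
                    c += iy * iy
            trace = a + c
            if trace > 0:
                det = a * c - b * b
                W[y][x] = det / trace                  # strength
                Q[y][x] = 4.0 * det / (trace * trace)  # roundness in [0, 1]
    return W, Q


def interest_points(W, Q, w_min, q_min):
    """Keep pixels above both thresholds that are 3x3 local maxima of W."""
    points = []
    for y in range(1, len(W) - 1):
        for x in range(1, len(W[0]) - 1):
            if W[y][x] < w_min or Q[y][x] < q_min:
                continue
            neighbourhood = [W[y + dy][x + dx]
                             for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            if W[y][x] >= max(neighbourhood):
                points.append((y, x))
    return points


# Toy image: a bright square whose corner sits at row 4, column 4.
img = [[1.0 if (y >= 4 and x >= 4) else 0.0 for x in range(9)] for y in range(9)]
W, Q = forstner_measures(img)
points = interest_points(W, Q, w_min=0.3, q_min=0.5)
```

On the toy image the corner pixel gets a high strength and roundness, while pixels along the straight edges get a determinant of zero and are rejected.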
Boris  3
11-09-2006 12:41 PM ET (US)
There are many papers that use a similar approach for object detection/recognition (cluster local features, represent images as bag of words - we saw this done in the paper Carolina presented for example). What is the novelty of this paper? Was this one of the first papers to do this?
Matt  4
11-09-2006 01:35 PM ET (US)
Edited by author 11-09-2006 01:37 PM
It's somewhat disappointing that during vocabulary construction their clustering reduced the number of parts by only about a third. Given the success of a couple of the features (tires), this seems to imply that most features did not find a good match in the other 49 images in the training set. That makes me somewhat surprised by the huge decrease in performance in figure 10 when the clustering stage is removed. I wonder how things would change if they increased the number of interest points; it looks like they only get about 8 per image, which seems small, particularly if you're going to cluster the points. Using a different measure of similarity might also improve clustering - correlation isn't shift invariant.
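
To make the shift-invariance point concrete, here is a small sketch. It assumes the similarity measure is a normalized cross-correlation between equal-size patches; the helper names and the striped toy patch are mine, not the paper's.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-size patches (lists of rows)."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    mean_a = sum(fa) / len(fa)
    mean_b = sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa) *
                    sum((y - mean_b) ** 2 for y in fb))
    return num / den if den > 0 else 0.0


def shift_right(patch, fill=0.0):
    """Shift a patch one pixel to the right, padding the left column."""
    return [[fill] + row[:-1] for row in patch]


# One-pixel-wide vertical stripes: the worst case for a one-pixel shift.
patch = [[0.0, 1.0, 0.0, 1.0] for _ in range(4)]
brighter = [[2.0 * v + 5.0 for v in row] for row in patch]

same = ncc(patch, patch)                    # identical patches
gain = ncc(patch, brighter)                 # gain/offset change only
shifted = ncc(patch, shift_right(patch))    # one-pixel translation
```

The score survives a brightness change but turns negative after a one-pixel shift of the same pattern, so two patches showing the same part, slightly misaligned, can fail to cluster together.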
Carolina Galleguillos  5
11-09-2006 04:11 PM ET (US)
I think this was THE paper that introduced the concept of the visual vocabulary. I wonder how well this approach performs on other categories that don't have many geometric advantages (I guess not that well).
Tingfan  6
11-09-2006 04:21 PM ET (US)
Multiple detections of the same object always happen when we use a sliding window to detect objects. I came up with an idea: we can
(a) randomly segment the test image into segments.
(b) do windowed detection on the test image.
(c) count the number of activated windows (or sum their confidence) falling in each segment.
(d) output the highest-scoring segments.

Then we get both detection and a good localization boundary.
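
Steps (c) and (d) could be sketched roughly as follows. This assumes the segmentation from step (a) is already available as a 2-D map of segment ids and that step (b) produced a list of (x, y, width, height, confidence) windows. Dividing each segment's votes by its area is an extra assumption of this sketch, added so that large segments do not win on size alone.

```python
def best_segment(labels, detections):
    """Vote for segments using window confidences.

    labels:     2-D list of segment ids, one per pixel.
    detections: list of (x, y, width, height, confidence) windows.
    Returns (best_id, scores) where scores maps segment id to the
    area-normalized confidence that fell inside that segment.
    """
    rows, cols = len(labels), len(labels[0])

    # Segment areas, used for normalization (an assumption of this sketch).
    area = {}
    for row in labels:
        for seg in row:
            area[seg] = area.get(seg, 0) + 1

    # Every pixel covered by a window adds that window's confidence
    # to the segment the pixel belongs to.
    votes = {seg: 0.0 for seg in area}
    for x, y, w, h, conf in detections:
        for r in range(max(y, 0), min(y + h, rows)):
            for c in range(max(x, 0), min(x + w, cols)):
                votes[labels[r][c]] += conf

    scores = {seg: votes[seg] / area[seg] for seg in area}
    best_id = max(scores, key=scores.get)
    return best_id, scores


# Toy example: left half is segment 0, right half is segment 1.
labels = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
detections = [(3, 0, 3, 4, 0.9), (2, 1, 3, 3, 0.8), (0, 0, 2, 2, 0.1)]
best_id, scores = best_segment(labels, detections)
```

In the toy example the two strong windows overlap mostly on the right half, so segment 1 is selected and its boundary serves as the localization.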
Tom  7
11-09-2006 04:46 PM ET (US)
I see a relatively easy SUGAR. More importantly, I really like that they limit the accuracy measures by taking into account the huge disparity between positive count and negative count. Personally I just use the metric of binary "Found everything with/without false positives", versus "missed something critical". I agree with Tingfan that the whole nearest neighbor suppression is inherent in all of the sliding window approaches. I agree that segmentation and selection of regions would be a good solution to that, but it might push things out of the nice realish time area.
Iman  8
11-09-2006 04:58 PM ET (US)
Edited by author 11-09-2006 04:58 PM
Before reading this paper, the only place I had heard of non-maximum suppression being used was in the Canny edge detector. Is it a commonly used general technique, and if so, where else is it used?
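
For reference, here is one common way the idea is applied to sliding-window detections: a greedy scheme that keeps the strongest window and drops any window overlapping it too much. This is a generic sketch and not necessarily the suppression rule used in the paper; the box format and the 0.5 overlap threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x, y, width, height, ...) boxes."""
    ax, ay, aw, ah = a[:4]
    bx, by, bw, bh = b[:4]
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, max_iou=0.5):
    """Greedy NMS over (x, y, width, height, confidence) detections."""
    kept = []
    # Visit detections from most to least confident.
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        # Keep a detection only if it does not overlap a stronger kept one.
        if all(iou(det, k) <= max_iou for k in kept):
            kept.append(det)
    return kept


detections = [(0, 0, 10, 10, 0.9), (1, 1, 10, 10, 0.8), (50, 50, 10, 10, 0.7)]
kept = non_max_suppression(detections)
```

Here the second window overlaps the first heavily and is suppressed, while the distant third window survives.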
Marius  9
11-09-2006 05:40 PM ET (US)
Classification seems a lot like a modified Non-Holographic Associative Memory, by Willshaw (1969), the central point of Hecht-Nielsen's confabulation theory.
Copyright ©1999-2008 Internicity Inc. All rights reserved.