

A global matching framework for stereo computation

03:52 AM ET (US)
Andrew: The paper says that 3 different approaches can be taken to choose the regions/segments:
1) Color Based segmentation
2) Edge based triangulation.
3) Point based triangulation.

However, only color based segmentation is used, partly because it is easy to implement and partly because it is reliable.

Color segmentation is a well studied field, and usually we don't ask "how accurate is it" but "how much segmentation". Your question is very pertinent, and it indeed IS one of the primary aspects of the paper. The answer is: we want to "over-segment" the image, by which I mean we would like to split even regions that look homogeneous, if possible. This ensures that even curved surfaces are modeled as piecewise planar (just as in graphics, where most curved objects are composed of planar triangles) and that we don't smooth out the curve with a single planar approximation.
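
To illustrate why over-segmentation helps with curved surfaces, here is a minimal 1D sketch (not from the paper): straight chords stand in for planes, and a parabola stands in for a curved depth profile. The function names and the choice of curve are assumptions made purely for illustration.

```python
def max_chord_error(f, n_segments, samples=1000):
    """Worst-case gap on [0, 1] between f and its piecewise-linear
    approximation built from n_segments equal-width chords."""
    worst = 0.0
    for i in range(samples + 1):
        x = i / samples
        # Index of the chord that covers x (clamped so x == 1 uses the last chord).
        j = min(int(x * n_segments), n_segments - 1)
        x0, x1 = j / n_segments, (j + 1) / n_segments
        t = (x - x0) / (x1 - x0)
        approx = (1 - t) * f(x0) + t * f(x1)
        worst = max(worst, abs(f(x) - approx))
    return worst


def curve(x):
    # Hypothetical curved depth profile.
    return x * x


# More (smaller) segments track the curve more closely.
errors = {n: max_chord_error(curve, n) for n in (1, 4, 16)}
for n, err in errors.items():
    print(n, err)
```

The error shrinks as the number of segments grows, which is the same reason many small planar segments can follow a curved surface that one large plane would flatten.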
Edited 10-08-2002 03:54 AM
03:43 AM ET (US)
Kristin: For each segment, we have an initial depth computed using any standard correlation based dense stereo technique. The author has made the simplifying assumption that each segment has k neighbors (which may not be true, but here we are just interested in getting a "feel" for the complexity involved). With this assumption, we say that a segment's depth is either equal to its initial depth or equal to the depth of one of its k neighbors. Hence, for each segment, we have k+1 guesses of depth. Since we have s segments, the depth map can have (k+1)^s different structures.
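
The counting argument can be checked on a toy example. This is only a sketch: the segment names, depths, and neighbor lists below are made up, and each segment's candidates are taken to be its own initial depth plus its neighbors' initial depths.

```python
from itertools import product


def candidate_depths(initial, neighbors, seg):
    """The k+1 guesses for one segment: its own initial depth,
    followed by the initial depth of each of its k neighbors."""
    return [initial[seg]] + [initial[n] for n in neighbors[seg]]


def count_depth_maps(neighbors):
    """Number of ways to pick one guess per segment.
    Equals (k+1)^s when every segment has exactly k neighbors."""
    total = 1
    for seg in neighbors:
        total *= len(neighbors[seg]) + 1
    return total


# Hypothetical scene: s = 3 segments, each with k = 2 neighbors.
initial = {"a": 3, "b": 5, "c": 7}
neighbors = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}

all_maps = list(product(*(candidate_depths(initial, neighbors, s) for s in neighbors)))
print(count_depth_maps(neighbors), len(all_maps))
```

Here both numbers are (2+1)^3 = 27. The count is of choices, so it is an upper bound on distinct depth maps: if two neighbors happen to share a depth, some choices coincide.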

I am not sure about the term AAM but I know of algorithms that use projection of 3D back to 2D to test the validity of generated 3D model.
Andrew Rabinovich
03:17 AM ET (US)
Although this is not the primary aspect of the paper, I am curious as to how crucial accurate color segmentation of the image is. In other words, how much of the success of the algorithm depends on correct color recognition?
Andrew Rabinovich
03:09 AM ET (US)
Section 2.2 begins with the decomposition of the image into regions. The majority of the description of the approach that follows is based on separate regions rather than the whole image. I am not sure how the regions are selected. It seems that choosing the appropriate regions would be crucial to the model.
Kristin Branson
02:22 AM ET (US)
This algorithm reminded me of the Active Appearance Model algorithm, which is used to estimate the shape of an image by trying out different parameterizations of the shape, warping a generated image of average shape to fit the generated shape, and testing the similarity between the warped image and the actual image. So it's like the depth map is the shape in AAM.

I'm confused about Section 2.2. Am I right to think the neighborhood hypothesis is that, for unmatched regions, the depth is probably the depth of its neighbors? Why are there only k+1 depth hypotheses for each segment? Is that just its ordinate depth compared to its neighbors? Aren't you trying to get some sort of relative measure? Maybe my confusion on this matter is causing my added confusion about why there are (k+1)^s total hypotheses. It seems like this calculation is too simple and is double counting.
12:59 AM ET (US)
Josh: I think you are quite correct in pointing out that the algorithm you suggest would give quite good results. However, the beauty of the algorithm presented is that it takes an analysis-by-synthesis approach: it tests the "goodness" of the depth map at each stage and improves it in an iterative manner. No doubt it seems to be computationally very expensive, but the quality of the results is incredible. The algorithm you suggest would give a depth map in one shot, but it probably would not correct a wrongly initialized depth map.
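
As a rough sketch of the iterative idea (not the paper's actual procedure), one could greedily try each neighbor's current depth for a segment and keep it only if a global score improves. The `cost` function here is a hypothetical stand-in for the synthesis error; in this toy it simply compares against a made-up reference depth map.

```python
def refine(initial, neighbors, cost, max_passes=10):
    """Greedy hypothesis testing: for each segment, try each neighbor's
    current depth and keep whichever choice gives the lowest global cost."""
    depths = dict(initial)
    for _ in range(max_passes):
        changed = False
        for seg in depths:
            best_depth, best_cost = depths[seg], cost(depths)
            for n in neighbors[seg]:
                trial = dict(depths)
                trial[seg] = depths[n]
                c = cost(trial)
                if c < best_cost:
                    best_depth, best_cost = depths[n], c
            if best_depth != depths[seg]:
                depths[seg] = best_depth
                changed = True
        if not changed:
            break  # no hypothesis improved the score; stop iterating
    return depths


# Toy setup (all values hypothetical): segment "c" starts with a wrong depth.
initial = {"a": 4, "b": 4, "c": 9}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
reference = {"a": 4, "b": 4, "c": 4}


def toy_cost(depths):
    # Stand-in for comparing a synthesized view with the real image.
    return sum((depths[s] - reference[s]) ** 2 for s in depths)


refined = refine(initial, neighbors, toy_cost)
print(refined)
```

The point of the sketch is only that a wrongly initialized segment can be repaired by borrowing a neighbor's depth when the global score says so, at the price of evaluating the score many times.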
Josh Wills
06:46 PM ET (US)
First of all, I think that the results in the paper are incredible. The image of a neighborhood from overhead where the houses are recognized as having a smaller depth level than the ground is quite incredible as there is so little parallax in the sequence since the scene is so far from the camera.

My concern is primarily over the running time of this algorithm. I am not sure how large the gain is over a simpler approach that pools the votes for a depth layer over a segment and gives the segment the most popular depth level. For the regions in the building images which have noise when the pixels are assigned individually, a segment pooling would work well in exactly the areas that the authors point out as problematic.
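
A minimal sketch of the simpler pooling approach described above, assuming each pixel already carries a depth level and a segment id (the sample data is made up):

```python
from collections import Counter


def pool_segment_depths(pixel_depths, segment_ids):
    """Give every segment the most popular depth level among its pixels."""
    votes = {}
    for depth, seg in zip(pixel_depths, segment_ids):
        votes.setdefault(seg, Counter())[depth] += 1
    # most_common(1) returns the top-voted depth; ties are broken arbitrarily.
    return {seg: counter.most_common(1)[0][0] for seg, counter in votes.items()}


# Hypothetical per-pixel depth levels with one noisy pixel in each segment.
pixel_depths = [2, 2, 9, 2, 5, 5, 1, 5]
segment_ids = [0, 0, 0, 0, 1, 1, 1, 1]

pooled = pool_segment_depths(pixel_depths, segment_ids)
print(pooled)
```

This runs in a single pass over the pixels, which is the running-time advantage in question; it has no step that revisits a segment whose majority vote is itself wrong.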
05:01 PM ET (US)

In the paper, towards the end of page 3, equation 2 contains a constant called beta, and a few lines later, on page 4, you see alpha. The two constants are the same.
