

Recognizing and Tracking Human Action

Shinko Cheng
05:21 PM ET (US)
They seem to have found a neglected approach in articulated human body modeling and tracking: designing a robust "detection" scheme for articulated human bodies within the detect-and-track paradigm. I think it is good, although very complicated and cumbersome, to look for the most complete model and then apply a Kalman filter or smoother here and there.

I found a new appreciation for the work on finding feature-invariant transforms after reading about the "topological type": with enough samples taken along its contour, a shape is allowed to deform and still be considered to have the same origin.
Neil Alldrin
06:24 AM ET (US)
I don't really know what to say... I think their technique is kinda cool, but will probably break down in more complicated contexts (tennis seems especially easy). Maybe I'll think of some questions during the presentation tomorrow.
Peter Schwer
03:19 AM ET (US)
I like the move to create a "closed-loop" system. But at what cost? Without video it is hard to say, but how effective is the closed-loop system? And how efficient?
Sunny Chow
01:50 AM ET (US)
The point correspondence algorithm they used seemed really neat, though. It is fast and efficient, and it also incorporates global information when detecting the correspondences. From the results, it seems to be fairly accurate.
Jing Shiau
12:35 AM ET (US)
Seems like this paper is more focused on recognizing action than tracking. Most of the paper seems to be focused on how to determine if two images belong to the same type of action and the so-called tracking is just figuring out the point correspondences between related images.
Meifang mentioned a point I was also wondering about: how does the rate of motion affect the tracking ability?
Looking forward to the presentation tomorrow for answers to these questions.
Meifang Huang
10:05 PM ET (US)
This paper simplifies human action recognition into key-frame matching. It works quite well when there is just one target model for recognition. Suppose we are given a database of several different actions, like walking, jogging, or going upstairs/downstairs: will this method still output the best recognized action? We could imagine that some different models share the same key frames or have similar patterns in these frames, e.g. jogging and running. One significant difference between these two actions is the pace, so the rate of action should also be taken into consideration when there are multiple action models. In this paper, they mention that they use a simple Markov chain to prevent inappropriate key frames from being chosen. Does this imply that they impose a specific order on the key frames? If so, will it allow deletion or insertion of key frames during matching?
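To make the ordering question concrete, here is a minimal sketch of what an ordered key-frame chain could look like. This is not the paper's method: the function name `best_keyframe_path`, the `max_skip` parameter, and the cost table are all made up for illustration. Each video frame may stay on the current key frame or move forward, and `max_skip` controls how many key frames may be skipped (a "deletion").

```python
INF = float("inf")


def best_keyframe_path(costs, max_skip=0):
    """Assign each video frame to a key frame with a left-to-right chain.

    costs[t][k] is a hypothetical matching cost of video frame t against
    key frame k. From key frame k the chain may stay at k or move forward
    by 1 .. 1 + max_skip key frames. Returns the lowest-cost assignment.
    """
    n_frames, n_keys = len(costs), len(costs[0])
    total = [[INF] * n_keys for _ in range(n_frames)]
    back = [[None] * n_keys for _ in range(n_frames)]

    # The first frame may start on key frame 0, or later if skipping is allowed.
    for k in range(min(n_keys, max_skip + 1)):
        total[0][k] = costs[0][k]

    for t in range(1, n_frames):
        for k in range(n_keys):
            # Allowed predecessors: k itself, k - 1, and skipped-over ones.
            for prev in range(max(0, k - 1 - max_skip), k + 1):
                cand = total[t - 1][prev] + costs[t][k]
                if cand < total[t][k]:
                    total[t][k] = cand
                    back[t][k] = prev

    # Trace back from the cheapest final key frame.
    k = min(range(n_keys), key=lambda j: total[-1][j])
    path = [k]
    for t in range(n_frames - 1, 0, -1):
        k = back[t][k]
        path.append(k)
    return path[::-1]


# Toy costs: 4 video frames, 3 key frames; key frame 1 never matches well.
costs = [
    [0, 5, 5],
    [0, 5, 5],
    [5, 5, 0],
    [5, 5, 0],
]
strict = best_keyframe_path(costs, max_skip=0)
relaxed = best_keyframe_path(costs, max_skip=1)
print(strict, relaxed)
```

With a strict order the chain is forced through the poorly matching key frame 1, while allowing one skip lets it jump from key frame 0 straight to key frame 2. Whether the paper's Markov chain permits that kind of skip is exactly what I would like to know.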
Matt Clothier
05:06 PM ET (US)
I find the idea of doing activity recognition before 3D reconstruction interesting (since 3D reconstruction is probably the most common means of tracking humans). By using selective correspondences between the key frame and the actual frame, the locations of selected body parts (like hands and feet) can be found. I do have a few concerns about this technique, though. First is the coarse head and body tracking that they use. Their likelihood function is based on a sum-of-squares distance measure between a color template and the image data. Although this may provide them with basic head and body localization, it can easily break if there happens to be an object in the image that has a similar color pattern (or if the person is wearing a shirt that matches the background). I am also concerned about the use of hand-drawn key frames. If their technique were extended to recognize everyday activities such as walking, running, jumping, playing tennis, etc., then many, many key frames would need to be available for matching (which would take a lot of effort). In addition, notice that the key frames in Figure 10 have the person facing forward for the most part. What if the person turns around? Will the right hand become the left hand and vice versa? Anyway, the technique has some promise, but at this point they make many assumptions about the data in order to get their technique to work. I am interested to hear other people's insights in defending their work.
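To illustrate the color-template concern, here is a minimal sketch of sum-of-squared-differences matching on a toy image. It is only an assumption about the general idea, not the paper's actual likelihood function; the helper names `ssd` and `best_matches` and the pixel values are made up for the example.

```python
def ssd(template, patch):
    """Sum of squared differences between two equally sized grids of RGB pixels."""
    return sum(
        (t - p) ** 2
        for t_row, p_row in zip(template, patch)
        for t_px, p_px in zip(t_row, p_row)
        for t, p in zip(t_px, p_px)
    )


def best_matches(image, template):
    """Return the lowest SSD score and every top-left (row, col) that achieves it."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best_score, best_pos = None, []
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ssd(template, patch)
            if best_score is None or score < best_score:
                best_score, best_pos = score, [(r, c)]
            elif score == best_score:
                best_pos.append((r, c))
    return best_score, best_pos


# Toy 1x6 "image": a red shirt at column 1 and a red distractor at column 4.
RED, GREEN = (200, 30, 30), (30, 160, 30)
image = [[GREEN, RED, GREEN, GREEN, RED, GREEN]]
template = [[RED]]

score, positions = best_matches(image, template)
print(score, positions)  # 0 [(0, 1), (0, 4)]
```

Both red pixels match the template equally well, so color alone cannot tell the person from the distractor. That ambiguity is what worries me about the coarse tracker.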
