Gyozo Gidofavi | 7 | 10-23-2001 12:40 PM ET (US)
I agree with the comment David made about the pseudo-loss. I was a little confused about its intuitive meaning and purpose.
As a reply or extension to the message posted by Hsin-Hao, I have an intuitive feeling that one can find real-life examples where a series of human experts learn in a manner similar to AdaBoost, but I have not been able to find an instance of this myself yet. I would be interested if any of you has an applicable example.
I found the remark about the validity of Occam's razor in 4.1 of the first paper very interesting. At first I thought that one of the reasons AdaBoost's performance is so exceptional is that it finds "independent, new" weak classifiers even when the training error of the previous classifier was 0. This could explain how AdaBoost can still improve the test error when the training error is already 0. It might also give AdaBoost an advantage over traditional gradient descent methods, which can easily overfit and cannot improve the test error once the training error is 0. However, when looking at the formula for the distribution D_{t+1} in the algorithm, it was not clear to me what this distribution would be if the previous weak classifier had 0 training error.
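To make the question about D_{t+1} concrete, here is a minimal sketch of one reweighting step. It assumes the AdaBoost.M1-style update, in which beta_t = eps_t / (1 - eps_t) and the weights of correctly classified examples are multiplied by beta_t before normalizing by Z_t. The papers under discussion may state the update differently. The function name `adaboost_m1_update` and the choice to return `None` when the normalizer is zero are illustrative assumptions, not part of any paper.

```python
import numpy as np

def adaboost_m1_update(D, correct):
    """One reweighting step (assumed AdaBoost.M1 form, for illustration).

    D       -- current distribution over training examples (sums to 1)
    correct -- boolean mask, True where the weak classifier was right
    Returns (eps, beta, D_next); D_next is None when the update is undefined.
    """
    D = np.asarray(D, dtype=float)
    correct = np.asarray(correct, dtype=bool)

    # Weighted training error of the weak classifier under D.
    eps = float(D[~correct].sum())
    if eps > 0.5:
        # Assumption: the boosting loop stops when the weak learner is this bad.
        raise ValueError("weighted error above 1/2")

    beta = eps / (1.0 - eps)

    # Shrink the weight of correctly classified examples by beta.
    unnormalized = np.where(correct, D * beta, D)
    Z = float(unnormalized.sum())

    if Z == 0.0:
        # eps == 0 gives beta == 0, so every weight becomes 0 and the
        # normalization D_next = unnormalized / Z would be 0/0.
        return eps, beta, None
    return eps, beta, unnormalized / Z


# One mistake out of four: the misclassified example gains weight.
print(adaboost_m1_update([0.25] * 4, [True, True, True, False]))

# Zero training error: the next distribution is undefined under this form.
print(adaboost_m1_update([0.25] * 4, [True, True, True, True]))
```

Under this assumed form, a weak classifier with 0 weighted error drives every weight to 0, so the formula alone does not define D_{t+1}. An implementation would have to stop or handle that round specially.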