QuickTopic (SM) free message boards
Topic: An adaptive regularization criterion for supervised learning
Sameer Agarwal  4
04-11-2001 06:50 AM ET (US)
hi,
Frankly, I do not understand how or why the technique works. My confusion comes from the arbitrariness of the choice of the baseline function phi.

Since we can choose whatever we want (at least the paper does not suggest anything), why not just choose phi(x) = 0 and drop it from the equation altogether? That removes one more parameter from the test setup and makes it simpler still. In that case, assuming we have enough unlabelled points available, in the limit the ratio is

 (1/n) \sum_i h(x_i)^2
 ---------------------
       E(h(x)^2)

where the sum runs over the n training points,

in which case it becomes a comparison of how well the empirical second moment of h(x) captures the true second moment of h(x), since deviations between the two are penalized.

So is generalization capability captured by how well the empirical estimate, computed from the training data alone, matches the second moment of the hypothesis? One nice thing is that as the number of training samples goes up, the ratio goes to 1 as expected, since in the limit of infinite training data we will be able to construct a perfect fit.
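To make the phi(x) = 0 case concrete, here is a rough sketch of the ratio discussed above. Everything in it is an assumption for illustration, not something taken from the paper: the hypothesis h is made up, E(h(x)^2) is approximated by averaging over a large pool of unlabelled points, the inputs are drawn uniformly from [-1, 1], and the symmetric penalty max(r, 1/r) is just one plausible way to penalize deviations from 1 in either direction.

```python
import random


def h(x):
    # Hypothetical hypothesis, chosen only for illustration.
    # It is strictly positive, so its second moment is never zero.
    return 3.0 * x ** 2 - x + 0.5


def second_moment(f, xs):
    """Average of f(x)^2 over the points xs."""
    return sum(f(x) ** 2 for x in xs) / len(xs)


def moment_ratio(f, train_xs, unlabeled_xs):
    """Empirical second moment on the training inputs divided by an
    estimate of E(f(x)^2) taken from the unlabelled inputs.
    This is the phi(x) = 0 case discussed in the post."""
    return second_moment(f, train_xs) / second_moment(f, unlabeled_xs)


def symmetric_penalty(ratio):
    """Assumed penalty form: grows as the ratio moves away from 1,
    whether it is too large or too small."""
    return max(ratio, 1.0 / ratio)


rng = random.Random(0)
# Large unlabelled pool standing in for the true input distribution.
unlabeled = [rng.uniform(-1.0, 1.0) for _ in range(20000)]

for n in (5, 50, 5000):
    train = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    r = moment_ratio(h, train, unlabeled)
    print(n, round(r, 3), round(symmetric_penalty(r), 3))
```

With small n the ratio can land well away from 1; with large n it settles close to 1. Note that this happens for any fixed h simply because the sample average converges, which is part of why it is unclear what the ratio alone says about fit.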

so my questions are :

1. Why do you need phi(x) ?
2. If you do need phi(x), what criterion can you use to select one?
3. (On a slightly wild note) Is it possible to prove a bound that connects d(phi(x), f(x)), where f(x) is the actual function underlying the data, with how good phi(x) is at predicting the goodness of fit? I get the idea from A*, where the closer the heuristic h is to the actual distance h*, the fewer states one has to explore.


Finally, I agree with Greg: I think the technique will not be able to differentiate between fairly similar hypotheses. On the other hand, perhaps that is what phi(x) is supposed to do, that is, if we use a nontrivial phi(x) we will be able to do the differentiation.

Also the authors do not mention what phi they used for the polynomial regression test.

sameer