Sanjeev Kumar | 4 | 10-05-2004 05:20 AM ET (US)
On the 4th page of the paper, Roweis says that the linear approximation of f(w) is only valid near a minimum. I don't understand why (at least as long as we interpret "near" in the Euclidean distance sense). I tried to understand it along the following lines.
Linear approximation of f(w) is valid in the current neighborhood (1) => quadratic approximation of E(w) is valid in the current neighborhood (2) => we can reach the minimum in one step, assuming exact validity (3) => we are near the minimum (4)
But implication (3) need not be true: it requires the additional condition that the neighborhood in (1) is large enough to contain the minimum point. Furthermore, implication (4) would require a different interpretation of "near".
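To make the chain above concrete, here is a short sketch of the algebra behind steps (1) to (3). The symbols d (target values), r (residual), J (Jacobian of f at the current w) and \delta (the step) are my own assumptions and may not match the paper's notation; the sum-of-squares form of E is also assumed.

```latex
% Assumed error function and residual (notation is illustrative):
E(w) = \| d - f(w) \|^2 , \qquad r = d - f(w)

% (1) Linear approximation of f around the current w:
f(w + \delta) \approx f(w) + J \delta

% (2) Substituting gives a quadratic model of E in the step \delta:
E(w + \delta) \approx \| r - J \delta \|^2
             = r^\top r - 2\, \delta^\top J^\top r + \delta^\top J^\top J \delta

% (3) Setting the gradient with respect to \delta to zero:
J^\top J \, \delta = J^\top r
```

The step \delta solving the last equation minimizes the quadratic model exactly, but it lands on the true minimum of E only if the linearization in (1) stays accurate over the whole segment from w to w + \delta, which is the extra condition mentioned above.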
One more (unrelated) question: there are some methods (e.g. Davidon-Fletcher-Powell) which update the inverse of the Hessian matrix using the secant equation instead of computing it at every iteration, which can be very useful for large problems. Is there any equivalent for updating the inverse of (H + \lambda diag(H)) so that it can be used in the Levenberg-Marquardt algorithm?
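For concreteness, the kind of secant-based update referred to above can be sketched as follows. This is only a minimal illustration of the standard DFP inverse-Hessian update in plain Python; the function names (`dfp_update`, `matvec`, `dot`) and the small quadratic example are assumptions made for illustration, and it does not address the damped matrix (H + \lambda diag(H)) in the question.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def dfp_update(H_inv, s, y):
    """One DFP update of a symmetric inverse-Hessian estimate H_inv.

    s = w_new - w_old          (change in parameters)
    y = grad_new - grad_old    (change in gradient)
    The result satisfies the secant equation  H_inv_new @ y == s.
    Assumes s.y > 0 and y.H_inv.y > 0 (no safeguards in this sketch).
    """
    Hy = matvec(H_inv, y)   # H_inv y (equals (y^T H_inv)^T since H_inv is symmetric)
    sy = dot(s, y)          # s^T y
    yHy = dot(y, Hy)        # y^T H_inv y
    n = len(s)
    return [
        [H_inv[i][j] + s[i] * s[j] / sy - Hy[i] * Hy[j] / yHy for j in range(n)]
        for i in range(n)
    ]


# Illustrative example: E(w) = 0.5 * w^T A w, so grad E(w) = A w.
A = [[4.0, 1.0], [1.0, 3.0]]
w_old, w_new = [1.0, 2.0], [0.5, 1.0]
s = [a - b for a, b in zip(w_new, w_old)]
y = [a - b for a, b in zip(matvec(A, w_new), matvec(A, w_old))]

H_new = dfp_update([[1.0, 0.0], [0.0, 1.0]], s, y)
secant_residual = [a - b for a, b in zip(matvec(H_new, y), s)]
```

After the update, `secant_residual` should be zero up to floating-point rounding, which is exactly the secant condition; the inverse is never recomputed from scratch.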