Jonathan Ultis
10
04-13-2001 06:37 PM ET (US)
The multiplicative error is harder to get a handle on than the additive error. It doesn't shift the location of zero error. In addition, it doesn't add any zero-error locations, since max(d_r/d_t, d_t/d_r) is always 1 or larger. However, it looks to me like it might introduce new local minima.
I don't really want to take the derivative of it because I'm lazy, but given
let u = sum_r((h(x_r) - y_r)^2)/r (mean squared error)
let v = sum_r((h(x_r) - O(x_r))^2)/r
let w = sum_t((h(x_t) - O(x_t))^2)/t
let s = uv
then the error is in the form
error = s/w
the derivative is
(w ds/dx - s dw/dx) / w^2
which is
(w (u dv/dx + v du/dx) - uv dw/dx) / w^2
The polynomial in the numerator can be at most order 5 and the polynomial in the denominator at most order 4, since u, v, and w are quadratic in x and du/dx, dv/dx, and dw/dx are linear.
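The order count above can be checked mechanically. Here is a rough sketch in plain Python, assuming u, v, and w are quadratics in one parameter x. The coefficients are made up for illustration and are not taken from any real h or data set. It builds the numerator both as w ds/dx - s dw/dx and in the expanded product-rule form, then reports the orders.

```python
# Polynomials are coefficient lists, lowest power first:
# [c0, c1, c2] means c0 + c1*x + c2*x^2.

def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [ai + bi for ai, bi in zip(a, b)]

def poly_sub(a, b):
    return poly_add(a, [-bi for bi in b])

def poly_mul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_deriv(a):
    d = [i * c for i, c in enumerate(a)][1:]
    return d if d else [0.0]

def degree(a, tol=1e-12):
    # Highest power whose coefficient is not (numerically) zero.
    for i in range(len(a) - 1, -1, -1):
        if abs(a[i]) > tol:
            return i
    return 0

# Hypothetical quadratics standing in for u, v, w (assumed values).
u = [1.0, -2.0, 1.0]   # (x - 1)^2
v = [2.0, 1.0, 3.0]    # 3x^2 + x + 2
w = [5.0, -1.0, 2.0]   # 2x^2 - x + 5

s = poly_mul(u, v)

# Quotient rule numerator: w ds/dx - s dw/dx
numerator = poly_sub(poly_mul(w, poly_deriv(s)), poly_mul(s, poly_deriv(w)))

# Same thing with ds/dx expanded: w (u dv/dx + v du/dx) - uv dw/dx
ds_expanded = poly_add(poly_mul(u, poly_deriv(v)), poly_mul(v, poly_deriv(u)))
expanded = poly_sub(poly_mul(w, ds_expanded), poly_mul(poly_mul(u, v), poly_deriv(w)))

denominator = poly_mul(w, w)

print("numerator order:", degree(numerator))
print("denominator order:", degree(denominator))
```

For these coefficients the leading terms don't cancel, so it reports order 5 over order 4. In a real problem u, v, and w all depend on the same h, so their coefficients are tied together and some cancellation is still possible.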
If things don't cancel out, it could be that the new error function creates a surface with many more local minima than the original error surface. Could it be that this method works by causing learning routines to get stuck in well-placed local minima instead of allowing them to find the global minimum?
Anyone want to solve the derivative and find out?
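Short of solving it by hand, one way to poke at the question is to scan a one-parameter slice numerically and count the dips. The sketch below does that for made-up quadratics u, v, and w (assumed values, not derived from any real h or data). The helper count_local_minima is a hypothetical name, and a grid scan like this can miss minima narrower than the grid spacing. If w stays positive, an order-5 numerator allows at most 5 critical points along the slice, so at most 3 of them can be local minima, compared with 1 for the plain quadratic u.

```python
def count_local_minima(f, lo, hi, n=1001):
    """Count strict interior local minima of f on an evenly spaced grid."""
    step = (hi - lo) / (n - 1)
    ys = [f(lo + i * step) for i in range(n)]
    return sum(
        1
        for i in range(1, n - 1)
        if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]
    )

# Hypothetical quadratics in a single parameter x (assumed values).
def u(x):
    return (x - 1.0) ** 2          # stands in for the mean squared error

def v(x):
    return 3.0 * x * x + x + 2.0   # always positive

def w(x):
    return 2.0 * x * x - x + 5.0   # always positive, so s/w has no poles

def error(x):
    return u(x) * v(x) / w(x)      # error = s/w with s = uv

minima_u = count_local_minima(u, -5.0, 5.0)
minima_error = count_local_minima(error, -5.0, 5.0)

print("local minima of u on the grid:", minima_u)
print("local minima of uv/w on the grid:", minima_error)
```

To try it on a real problem, swap in u, v, and w computed from an actual h along a line through weight space, and compare the two counts over several random directions.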