| sameer agarwal
|
12
|
 |
|
04-11-2002 04:26 PM ET (US)
|
|
Here is some more on the VC dimension. The VC dimension is used as measure of the "capacity" of a classifer learning algorithm. By capacity we mean, how many different arrangements of a number of points can be classifed correctly by a classifer learned using the the algorithm.
A simple example is an algorithm which learns to draw a hyperplane in n-dimensions. The analytical result is that the VC dimension of the set of hyperplanes in n dimensions is n+1. Consider the case of n=2, the 2 dimensional plane. Take any three points and given then abritray biniary labels. It is easy to see that its possible to draw a line such that points with the same labels lie on the same side of the line.
However if we take four points. Consider an arrange of these points on the vertices of a square, with the points lying on the positive diagonal labeled 1 and the points lying on the negative diagonal labeled 0. In this case we cannot draw any line which will seperate the two classes into distinct regions of the space.
Hence in 2 dimensions the class of straight lines has a VC dimenions of 3. Notice that this does not mean that there do not exist arrangements of points > 3 which cannot be seperated using a straight line. All that it means is that 3 is the maximum number for which "all" possible labelings of the points can be seperated.
Now for any learning algorithm, its VC dimension is defined to be the VC dimension of the set of hypothesis (boundaries) that it can learn.
Finally, the reason why there is so much talk about the VC dimensions of learning algorithms is because of the seminal work of Vapnik and Chervonenkis on bounding the test error of a classifier in terms of its training error, and its VC dimension.
approximately the following inequality holds :
E_test <= E_train + O(sqrt(V))
where V is the VC dimension of the learning algorithm. So even though you might be able to get zero training error by choosing a large and complicated enough learning algorithm (high VC dimension) the test error need not necessarily be low.
|