Hello, I will explain how the SVM algorithm works. This video will explain the support vector machine for linearly separable, binary data sets.
Suppose we have these two features, x1 and x2, and we want to classify all these elements. You can see that we have the class circle and the class rectangle.
So the goal of the SVM is to design a hyperplane, here we define this green line as the hyperplane, that classifies all the training vectors into two classes.
Here we show two different hyperplanes, both of which can correctly classify all the instances in this feature set.
But the best choice will be the hyperplane that leaves the maximum margin from both classes.
The margin is the distance between the hyperplane and the closest elements to that hyperplane.
In the case of the red hyperplane we have this distance, so this is the margin, which we represent by z1. And in the case of the green hyperplane we have the margin that we call z2.
We can clearly see that the value of z2 is greater than z1, so the margin is larger in the case of the green hyperplane, and in this case the best choice will be the green hyperplane.
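As a rough sketch of this comparison in Python (the points and the two candidate hyperplanes below are hypothetical, not the ones drawn in the video's figure), the margin of a separating hyperplane is simply the smallest distance from any training point to it:

```python
import numpy as np

def margin(w, w0, X):
    # Smallest perpendicular distance from the points in X to the
    # hyperplane w . x + w0 = 0, i.e. that hyperplane's margin
    return np.min(np.abs(X @ w + w0) / np.linalg.norm(w))

# Hypothetical 2-D points, two per class, chosen only for illustration
X = np.array([[1.0, 1.0], [2.0, -1.0],   # one class
              [2.0, 3.0], [3.0, 4.0]])   # the other class

# Two candidate separating hyperplanes, playing the roles of "red" and "green"
z1 = margin(np.array([0.0, 1.0]), -1.8, X)   # red:   x2 = 1.8
z2 = margin(np.array([1.0, 2.0]), -5.5, X)   # green: x1 + 2*x2 = 5.5
print(z1, z2)   # z2 > z1, so the "green" hyperplane is the better choice
```

Any hyperplane that separates the data has such a margin; the SVM is the one that maximizes it.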
Suppose we have this hyperplane. This hyperplane is defined by one equation, which we can state as g(x) = omega · x + omega 0: we have a vector of weights omega plus the bias omega 0.
This equation will deliver values greater than or equal to 1 for all the input vectors which belong to class 1, in this case the circles.
And we also scale this hyperplane so that it will deliver values smaller than or equal to -1 for all the vectors which belong to class number 2, the rectangles.
So we can say that for every training vector the modulus of this value will be at least 1, and for the closest elements the modulus is exactly 1.
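To make this concrete, here is a small Python sketch of that decision function; the weight vector and omega 0 are borrowed from the worked example later in the video, with w standing for the omega vector:

```python
import numpy as np

# Decision function g(x) = omega . x + omega_0, written here as w and w0.
# These particular values are the ones found in the worked example later
# in the video; any correctly scaled separating hyperplane behaves the same way.
w, w0 = np.array([2/5, 4/5]), -11/5

def g(x):
    return w @ x + w0

print(g(np.array([1, 1])))   # -1.0 -> one class, modulus exactly 1 (a closest element)
print(g(np.array([2, 3])))   # +1.0 -> the other class, modulus exactly 1 (a closest element)
```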
From geometry we know that the distance between a point and a hyperplane is computed by this equation, the modulus of g(x) divided by the norm of the weight vector.
So the total margin, which is composed of the distances from the closest element of each class, will be computed by this equation: 2 divided by the norm of the weight vector.
The aim is that minimizing this term, the norm of the weight vector, will maximize the separability: when we minimize this weight vector we will have the biggest margin here that splits these two classes.
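A quick sketch of those two formulas, again reusing the weight vector and omega 0 from the worked example later in the video:

```python
import numpy as np

# Distance from a point x to the hyperplane is |g(x)| / ||w||; with the closest
# points scaled so that |g(x)| = 1, the total margin between the classes is 2 / ||w||.
# w and w0 are the values from the worked example later in the video.
w, w0 = np.array([2/5, 4/5]), -11/5

def distance(x):
    return abs(w @ x + w0) / np.linalg.norm(w)

print(distance(np.array([1, 1])))   # ~1.118 = 1 / ||w||, one closest element
print(distance(np.array([2, 3])))   # ~1.118 = 1 / ||w||, the other closest element
print(2 / np.linalg.norm(w))        # ~2.236 = 2 / ||w||, the total margin
```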
Minimizing this weight vector is a nonlinear optimization task, which can be solved by the Karush-Kuhn-Tucker (KKT) conditions, using Lagrange multipliers.
The main equations state that the value of omega will be the solution of this sum here, and we also have this other rule; both are written out below.
So when we solve these equations, trying to minimize this omega vector, we will maximize the margin between the two classes, which will maximize the separability of the two classes.
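The on-screen equations themselves are not reproduced in the transcript; presumably they are the standard hard-margin SVM formulation and the two conditions that follow from it:

```latex
% Presumed content of the on-screen equations: the standard hard-margin SVM
\[
\min_{\boldsymbol{\omega},\,\omega_0}\ \tfrac{1}{2}\,\lVert\boldsymbol{\omega}\rVert^{2}
\quad\text{subject to}\quad
y_i\left(\boldsymbol{\omega}^{\top}\mathbf{x}_i + \omega_0\right) \ge 1 \ \text{ for all } i
\]
% With Lagrange multipliers alpha_i >= 0, the KKT conditions give
\[
\boldsymbol{\omega} = \sum_i \alpha_i\, y_i\, \mathbf{x}_i ,
\qquad
\sum_i \alpha_i\, y_i = 0
\]
```

Here we show a simple example.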
Suppose we have these two features, x1 and x2, and we have these three points.
We want to design, or to find, the best hyperplane that will divide these two classes.
We can see clearly from this graph that the best division line will be a line parallel to the line that connects these two points here.
So we can define this weight vector, which is this point minus this other point, so we have the constant a and 2 times this constant a, as written out below.
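Written out, taking the two closest points of the two classes to be (1,1) and (2,3), which is what the algebra below works with, the weight vector is:

```latex
% Weight vector along the direction between the two closest points,
% scaled by the unknown constant a
\[
\boldsymbol{\omega} = a\,\bigl[(2,3) - (1,1)\bigr] = a\,(1,2) = (a,\ 2a)
\]
```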
Now we can solve this weight vector and create the hyperplane equation considering this weight vector. We must discover the value of this constant a.
Since we have this weight vector omega here, we can substitute the values of this point, and also, using this other point, we can substitute these two values here.
When we evaluate the equation g at the input vector (1,1), we know that we must have the value -1, because this point belongs to the class circle. So we will have this value here.
When we use the second point, we apply the function and we know that it must deliver the value +1, so we substitute it into the equation as well; both equations are written out below.
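Written out, these are the two equations; the coordinates of the second point, (2,3), are inferred from the algebra that follows rather than read from the figure:

```latex
% Substituting both points into g(x) = omega . x + omega_0 with omega = (a, 2a)
\[
g(1,1) = a\cdot 1 + 2a\cdot 1 + \omega_0 = 3a + \omega_0 = -1
\]
\[
g(2,3) = a\cdot 2 + 2a\cdot 3 + \omega_0 = 8a + \omega_0 = +1
\]
```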
Well, given these two equations, we can isolate the value of omega 0 in the second equation, and we will have omega 0 equal to 1 minus 8 times a.
So, using this value, we put this omega 0 into the first equation and we will reach the value of a, which is 2 divided by 5.
Now that we have discovered the value of a, we substitute it back into the first equation and also discover the value of omega 0: we come to the conclusion that omega 0 is minus 11 divided by 5.
And since we know that the weight vector is (a, 2a), we can substitute the value of a here and we will obtain the values of the weight vector.
So in this case these two points are called the support vectors, because they compose the omega value, 2 divided by 5 and 4 divided by 5.
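Tying this back to the Lagrange-multiplier equations from earlier, one can also check the two rules on this example; the multiplier value 2/5 is derived here and is not stated in the video:

```latex
% Both support vectors receive the same multiplier, alpha = 2/5 (derived, not stated in the video)
\[
\alpha_1 = \alpha_2 = \tfrac{2}{5}, \qquad
\boldsymbol{\omega} = \alpha_2(+1)\,(2,3) + \alpha_1(-1)\,(1,1)
= \tfrac{2}{5}\,(1,2) = \left(\tfrac{2}{5},\ \tfrac{4}{5}\right),
\qquad
\sum_i \alpha_i y_i = \tfrac{2}{5} - \tfrac{2}{5} = 0
\]
```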
And when we substitute here the values of omega, 2 divided by 5 and 4 divided by 5, and also the omega 0 value, we will deliver the final equation which defines this green hyperplane, which is x1 plus 2 times x2 minus 5.5 equal to 0.
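As a small numeric re-check of the whole example (a sketch in Python, not code from the video):

```python
import numpy as np

# Re-check of the worked example: solve 3a + w0 = -1 and 8a + w0 = +1,
# rebuild the weight vector w = (a, 2a), and evaluate the hyperplane.
A = np.array([[3.0, 1.0],
              [8.0, 1.0]])
b = np.array([-1.0, 1.0])
a, w0 = np.linalg.solve(A, b)
w = np.array([a, 2 * a])

print(a, w0)             # ~0.4 and ~-2.2, i.e. 2/5 and -11/5
print(w)                 # ~[0.4 0.8],     i.e. 2/5 and 4/5
print(w @ [1, 1] + w0)   # -1.0 for the circle support vector
print(w @ [2, 3] + w0)   # +1.0 for the other support vector
print(w[0] / w[0], w[1] / w[0], w0 / w[0])   # 1.0 2.0 -5.5 -> x1 + 2*x2 - 5.5 = 0
```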
And this hyperplane classifies the elements using support vector machines.
These are some references that we have used.
So this is how the SVM algorithm works.