Thursday, February 10, 2011

Using LIBSVM as a tool for classification and prediction

Intelligent Systems

Dr. Abdul Rahim Ahmad



By
Omar Salah Farhan  (ST21058)
Anes .A.Shaker       (ST21059)

Contributions to this project: both authors completed the project work jointly.



Analysis with LibSVM
The data is set up for analysis with the LibSVM tools. It contains two classes with N samples each, and it is treated as 2-D data for the analysis, as shown in Figure 1.
Figure 1. 2-D visualization of the data used in the SVM method

We want to find the best value of the parameter C using 2-fold cross-validation (half the data is used to train, the other half to test) and a linear kernel (-t 0). After finding the best value of C, we train on the entire data set. The data set is downloaded from the LIBSVM website and saved in the Matlab work folder.
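With the LIBSVM command-line tools, this search is a loop over calls such as `svm-train -t 0 -v 2 -c <C>`. The selection loop itself can be sketched in stand-alone Python; since running LIBSVM here would require the actual binary, a trivial threshold classifier (the hypothetical `threshold_score` helper, not part of LIBSVM) stands in for the SVM:

```python
# Conceptual sketch of the 2-fold cross-validation grid search described
# above. A trivial 1-D threshold rule stands in for the SVM so the
# selection loop can run on its own.

def two_fold_cv(train_and_score, data, labels):
    """Split the data in half, score each half as the test fold,
    and return the mean accuracy."""
    mid = len(data) // 2
    folds = [(slice(0, mid), slice(mid, None)),
             (slice(mid, None), slice(0, mid))]
    accs = [train_and_score(data[tr], labels[tr], data[te], labels[te])
            for tr, te in folds]
    return sum(accs) / len(accs)

def threshold_score(c):
    """Stand-in for 'train an SVM with parameter c, report test accuracy':
    classifies x as +1 when x > c."""
    def fit_and_score(xtr, ytr, xte, yte):
        preds = [1 if x > c else -1 for x in xte]
        return sum(p == y for p, y in zip(preds, yte)) / len(yte)
    return fit_and_score

# Toy 1-D two-class data: negatives near 0, positives near 1, interleaved
# so that each fold contains both classes.
data   = [0.1, 0.9, 0.2, 1.0, 0.3, 1.1, 0.4, 1.2]
labels = [-1, 1, -1, 1, -1, 1, -1, 1]

# Powers-of-two grid, mirroring the C grid (2^-5 ... 2^7) used with LIBSVM.
grid = [2.0 ** k for k in range(-5, 8)]
best = max(grid, key=lambda c: two_fold_cv(threshold_score(c), data, labels))
```

The same pattern, with the stand-in replaced by an actual `svm-train -v 2` call, reproduces the grid search used in this report.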

The data used for the analysis is downloaded from the LIBSVM website. The diabetes-scale data set is selected for analysis with LibSVM in combination with Matlab simulation. The input features are taken as indicators of diabetes, and the output is the diabetes scale to be measured.
Using the SVM method, we solve the problem of obtaining the diabetes scale as the final outcome from the supplied diabetes-indicator inputs.
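The downloaded diabetes_scale file is in LIBSVM's sparse text format: one sample per line, written as `label index:value index:value ...`. A minimal sketch of a parser for this format (the example line is made up for illustration, not taken from the real data set):

```python
# Parse one line of LIBSVM's sparse "label index:value ..." text format.

def parse_libsvm_line(line):
    parts = line.split()
    label = float(parts[0])          # class label comes first
    features = {}
    for tok in parts[1:]:            # remaining tokens are index:value pairs
        idx, val = tok.split(":")
        features[int(idx)] = float(val)
    return label, features

# Hypothetical example line in the same format as diabetes_scale.
label, feats = parse_libsvm_line("-1 1:0.058 2:0.44 3:0.40")
```

Tools such as LIBSVM's own readers perform this parsing internally; the sketch just shows what the file layout means.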



A table of the cross-validation training results for all the combinations of C and gamma values follows:


Cross-validation accuracy (%); rows are gamma (g) values, columns are C values.

g \ C     0.03125  0.0625  0.125  0.25   0.5    1      2      4      8      16     32     64     128
0.03125   69       69.5    70     70     70.5   71     72     74     73     71     69     67.5   68
0.0625    70       71.5    72     72.5   73     73     73.5   75     74     72     69     67.5   68
0.125     71.5     72      72     72.75  73.5   74     74.5   75.5   74.5   74     72     70     68.5
0.25      72       72.5    73     74     74     75     76     77     75     74.5   72.5   71     69
0.5       72.5     73      73.25  75     75     75.5   77     78     76     75     73     72     70
1         74       74.25   74     74.5   75.5   76     77.5   79     77     76     74     72.5   71
2         73       73.75   74     74     74.5   77.5   77.5   77.75  75.5   74.5   73     72     70
4         73       73.5    73.5   73.5   74     76     76.5   77     74     73     72.5   71.5   69.5
8         72.5     73      73.25  73.25  73.75  75     75.5   76     73     73     72     71     69
16        71       72      73     73.25  73.5   74     75     76.5   72.5   72     71.5   70.5   68.5
32        70       71.5    72     72.5   73     73.5   74     74     72     72     71     70     68
64        69.5     70      71     72     72.5   72.5   73.5   73.5   71.5   71.5   70.5   68.5   67
128       69       69      70     71     71     72     72.5   73     71     71     70     68     67



Best C: 4
Best gamma: 0.99 (approximately 1)
Highest accuracy: 79.02 %
Time taken for the whole cross-validation training: 5.87 seconds
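Reading the best parameter pair off the table amounts to an argmax over the grid. A small sketch, using a few entries copied from the cross-validation table above:

```python
# Select the (C, gamma) pair with the highest cross-validation accuracy.
# The entries below are a subset of the full grid results reported above.

cv_accuracy = {
    (2.0, 0.5): 77.0, (4.0, 0.5): 78.0,
    (2.0, 1.0): 77.5, (4.0, 1.0): 79.0, (8.0, 1.0): 77.0,
}

# Argmax over the dictionary keys by accuracy value.
best_c, best_gamma = max(cv_accuracy, key=cv_accuracy.get)
```

This matches the table's peak at C = 4, gamma near 1.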


 
 
Results for the SVM two-class model

In order to build an LS-SVM model, we need two extra parameters:
gamma (gam), the regularization parameter, which determines the
trade-off between minimizing the fitting error and smoothness; and,
in the common case of the RBF kernel, sigma^2 (sig2), the bandwidth.
The parameters of the SVM method for the two-class case are shown
in Table 1.
Table 1. Parameters used in the SVM method with two classes

gam        10
sig2       0.2
type       'classification'
[alpha,b]  trainlssvm({X,Y,type,gam,sig2,'RBF_kernel'})
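For reference, sig2 enters as the bandwidth of the RBF kernel. A small sketch under the LS-SVMlab-style convention K(x, z) = exp(-||x - z||^2 / sig2); note that LIBSVM's -g gamma plays the role of 1/sig2 in this pairing, up to the factor-of-two convention some texts use:

```python
import math

# RBF kernel with explicit bandwidth sig2, as tuned in the report:
# K(x, z) = exp(-||x - z||^2 / sig2).

def rbf_kernel(x, z, sig2):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / sig2)

# Identical points give similarity exactly 1; similarity decays toward 0
# as the points move apart, faster when sig2 is small.
same_point = rbf_kernel([1.0, 2.0], [1.0, 2.0], 0.2)
far_points = rbf_kernel([0.0], [1.0], 0.2)
```

A small sig2 such as the 0.2 in Table 1 therefore makes the kernel very local, which is consistent with tuning it on scaled data.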


Results for the SVM multiclass

The SVM multiclass simulation is performed
with LibSVM in combination with MATLAB.
The simulation results are reported below in sequence.
 1. Coupled Simulated Annealing results:  [gam]         0.98774
                                          [sig2]        40.2467
                                          F(X)=         0.17
 2. Optimization routine:           simplex
    cost function:                  crossvalidatelssvm
    kernel function                 RBF_kernel

 3. starting values:                   0.987738      40.2467
 Iteration   Func-count    min f(x)    log(gamma)    log(sig2)    Procedure

     1           3     1.700000e-001     -0.0123        3.6950      initial
     2           5     1.700000e-001     -0.0123        3.6950      contract inside
     3           9     1.700000e-001     -0.0123        3.6950      shrink
     4          11     1.700000e-001     -0.0123        3.6950      contract outside
Simplex results:
X=0.987738   40.246706,  F(X)=1.700000e-001

Obtained hyper-parameters: [gamma sig2]: 0.987738      40.2467
Multidimensional output; tuning time: 5.875 seconds
Accuracy: 79.02 %
Discussion
   The multiclass classification case is more delicate, as many of the
algorithms are built on top of the binary SVM method through the
LIBSVM tools. In this short survey we investigate techniques for
solving the multiclass classification problem.
Support Vector Machines are among the most robust and
successful classification algorithms. They are based on the idea
of maximizing the margin, i.e. maximizing the minimum distance
from the separating hyperplane to the nearest example.
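The margin idea can be made concrete in a few lines: the distance from a point x to the hyperplane w·x + b = 0 is |w·x + b| / ||w||, and the margin is the minimum of this distance over the training points. The numbers below are toy values, not taken from the diabetes data:

```python
import math

# Distance from point x to the hyperplane w.x + b = 0: |w.x + b| / ||w||.
def distance_to_hyperplane(w, b, x):
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi ** 2 for wi in w))
    return abs(dot + b) / norm

# Toy training points; hyperplane x + y = 1 (w = (1, 1), b = -1).
points = [(2.0, 2.0), (0.0, 0.0), (3.0, 1.0)]

# The (geometric) margin is the minimum distance over the training set;
# the SVM chooses w and b to make this quantity as large as possible.
margin = min(distance_to_hyperplane([1.0, 1.0], -1.0, p) for p in points)
```

Here the nearest point is (0, 0), at distance 1/sqrt(2) from the hyperplane.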
In these extensions, additional parameters and constraints are added
to the optimization problem to handle the separation of the different classes.
This formulation of the SVM method results in a large optimization problem,
which may be impractical for a large number of classes.
On the other hand, multiclass SVM classification offers
a better formulation with a more efficient implementation.