Intelligent Systems
Dr. Abdul Rahim Ahmad
Using LIBSVM as a tool for classification and prediction
By
Omar Salah Farhan (ST21058)
Anes A. Shaker (ST21059)
Contributions to this project: we completed the project work together.
Analysis with LibSVM
The data is set up for analysis with the LibSVM tools. It contains two classes with N samples each, and is given and used as 2-D data, as shown in Figure 1.
Figure 1. 2-D demonstration of data used in SVM method
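As an illustration, a minimal MATLAB sketch that produces a 2-D scatter plot of this kind; the sample count, cluster positions, and spread are made-up values for the picture, not the project data:

```matlab
% Illustrative sketch only: synthetic two-class 2-D data similar to Figure 1.
% N and the cluster centers are assumed values, not taken from the project.
N  = 100;
X1 = randn(N, 2) + 1;                      % class +1 cluster around (+1, +1)
X2 = randn(N, 2) - 1;                      % class -1 cluster around (-1, -1)
plot(X1(:,1), X1(:,2), 'bo'); hold on;
plot(X2(:,1), X2(:,2), 'rx'); hold off;
legend('class +1', 'class -1');
xlabel('x_1'); ylabel('x_2');
title('2-D demonstration of two-class data');
```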
We want to find the best value of the parameter C using 2-fold cross-validation (i.e. half the data is used for training and the other half for testing) and a linear kernel (-t 0). After finding the best value of C, we train on the entire data set. The data set is downloaded from the website link and saved in the MATLAB work folder.
The data used for the analysis is downloaded from the LIBSVM website. The diabetes-scale data set is selected for analysis with LibSVM in combination with a MATLAB simulation. The input data are indicators of diabetes, and the output is the diabetes scale to be measured.
Using the SVM method, we solve the problem of predicting the diabetes scale, as the final outcome of the diabetes measurements, from the supplied diabetes indicators.
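A minimal sketch of this step, assuming the LIBSVM MATLAB interface (libsvmread, svmtrain) is on the path and diabetes_scale is in the work folder; with the -v option, svmtrain returns the cross-validation accuracy instead of a model:

```matlab
% Load the diabetes_scale data (LIBSVM format): y = labels, X = sparse features.
[y, X] = libsvmread('diabetes_scale');

% 2-fold cross-validation over C with a linear kernel (-t 0).
bestC = NaN;  bestAcc = -Inf;
for log2c = -5:7                                  % C = 2^-5, ..., 2^7
    acc = svmtrain(y, X, sprintf('-t 0 -c %g -v 2 -q', 2^log2c));
    if acc > bestAcc
        bestAcc = acc;  bestC = 2^log2c;
    end
end
fprintf('best C = %g, CV accuracy = %.2f%%\n', bestC, bestAcc);

% Retrain on the entire data set with the selected C.
model = svmtrain(y, X, sprintf('-t 0 -c %g -q', bestC));
```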
A table of the cross-validation training accuracies (%) for all combinations of the C and gamma values is given below (rows: gamma, columns: C):
| g \ C   | 0.03125 | 0.0625 | 0.125 | 0.25  | 0.5   | 1    | 2    | 4     | 8    | 16   | 32   | 64   | 128  |
|---------|---------|--------|-------|-------|-------|------|------|-------|------|------|------|------|------|
| 0.03125 | 69      | 69.5   | 70    | 70    | 70.5  | 71   | 72   | 74    | 73   | 71   | 69   | 67.5 | 68   |
| 0.0625  | 70      | 71.5   | 72    | 72.5  | 73    | 73   | 73.5 | 75    | 74   | 72   | 69   | 67.5 | 68   |
| 0.125   | 71.5    | 72     | 72    | 72.75 | 73.5  | 74   | 74.5 | 75.5  | 74.5 | 74   | 72   | 70   | 68.5 |
| 0.25    | 72      | 72.5   | 73    | 74    | 74    | 75   | 76   | 77    | 75   | 74.5 | 72.5 | 71   | 69   |
| 0.5     | 72.5    | 73     | 73.25 | 75    | 75    | 75.5 | 77   | 78    | 76   | 75   | 73   | 72   | 70   |
| 1       | 74      | 74.25  | 74    | 74.5  | 75.5  | 76   | 77.5 | 79    | 77   | 76   | 74   | 72.5 | 71   |
| 2       | 73      | 73.75  | 74    | 74    | 74.5  | 77.5 | 77.5 | 77.75 | 75.5 | 74.5 | 73   | 72   | 70   |
| 4       | 73      | 73.5   | 73.5  | 73.5  | 74    | 76   | 76.5 | 77    | 74   | 73   | 72.5 | 71.5 | 69.5 |
| 8       | 72.5    | 73     | 73.25 | 73.25 | 73.75 | 75   | 75.5 | 76    | 73   | 73   | 72   | 71   | 69   |
| 16      | 71      | 72     | 73    | 73.25 | 73.5  | 74   | 75   | 76.5  | 72.5 | 72   | 71.5 | 70.5 | 68.5 |
| 32      | 70      | 71.5   | 72    | 72.5  | 73    | 73.5 | 74   | 74    | 72   | 72   | 71   | 70   | 68   |
| 64      | 69.5    | 70     | 71    | 72    | 72.5  | 72.5 | 73.5 | 73.5  | 71.5 | 71.5 | 70.5 | 68.5 | 67   |
| 128     | 69      | 69     | 70    | 71    | 71    | 72   | 72.5 | 73    | 71   | 71   | 70   | 68   | 67   |
Best C: 4. Best gamma: determined as 0.99 (~1).
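A sketch of the grid search behind a table like the one above, reusing y and X from the earlier loading step (RBF kernel -t 2, with -g for gamma and -c for C, 2-fold cross-validation):

```matlab
% Grid search over (C, gamma) with the RBF kernel and 2-fold CV.
log2cs = -5:7;  log2gs = -5:7;                    % 0.03125 ... 128 on both axes
acc = zeros(numel(log2gs), numel(log2cs));
tic;
for i = 1:numel(log2gs)
    for j = 1:numel(log2cs)
        opts = sprintf('-t 2 -g %g -c %g -v 2 -q', 2^log2gs(i), 2^log2cs(j));
        acc(i, j) = svmtrain(y, X, opts);         % CV accuracy in percent
    end
end
[bestAcc, k] = max(acc(:));
[i, j] = ind2sub(size(acc), k);
fprintf('best C = %g, best gamma = %g, accuracy = %.2f%% (%.2f s)\n', ...
        2^log2cs(j), 2^log2gs(i), bestAcc, toc);
```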
The highest accuracy: 79.02 %. Time taken for the whole cross-validation training: 5.87 seconds.

Results of the two-class SVM

To build an LS-SVM model, we need two extra parameters: gamma (gam) is the regularization parameter, determining the trade-off between fitting-error minimization and smoothness; in the common case of the RBF kernel, sigma^2 (sig2) is the bandwidth. The parameters of the SVM method for the two-class case are shown in Table 1.

Table 1. The parameters used in the SVM method with two classes
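A hedged sketch of this tuning step using the LS-SVMlab toolbox (its initlssvm/tunelssvm/trainlssvm object interface; exact arguments may differ between toolbox versions). LS-SVMlab expects dense inputs, so the sparse LIBSVM matrix is converted with full():

```matlab
% LS-SVM classifier with an RBF kernel (LS-SVMlab toolbox assumed).
Xd = full(X);                                     % LS-SVMlab needs dense data
model = initlssvm(Xd, y, 'classification', [], [], 'RBF_kernel');

% Simplex search over (gam, sig2) with a cross-validation misclassification
% cost, matching the tuning log reported below.
model = tunelssvm(model, 'simplex', 'crossvalidatelssvm', {10, 'misclass'});

model = trainlssvm(model);                        % train with tuned gam, sig2
yhat  = simlssvm(model, Xd);                      % predicted class labels
```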
Results for the multiclass SVM

The multiclass SVM simulation is carried out with LibSVM in combination with MATLAB. The results of the simulation are listed in order:

1. Coupled Simulated Annealing results: [gam] 0.98774, [sig2] 40.2467, F(X) = 0.17
2. Optimization routine: simplex; cost function: crossvalidatelssvm; kernel function: RBF_kernel
3. Starting values: 0.987738, 40.2467

| Iteration | Func-count | min f(x)     | log(gamma) | log(sig2) | Procedure        |
|-----------|------------|--------------|------------|-----------|------------------|
| 1         | 3          | 1.700000e-01 | -0.0123    | 3.6950    | initial          |
| 2         | 5          | 1.700000e-01 | -0.0123    | 3.6950    | contract inside  |
| 3         | 9          | 1.700000e-01 | -0.0123    | 3.6950    | shrink           |
| 4         | 11         | 1.700000e-01 | -0.0123    | 3.6950    | contract outside |

Simplex results: X = (0.987738, 40.246706), F(X) = 1.700000e-01
Obtained hyper-parameters [gamma, sig2]: 0.987738, 40.2467
Tuning time: 5.875 seconds
Accuracy: 79.02 %

Discussion

The multiclass classification case is more delicate, as many of the algorithms are built on top of the binary SVM method through the LIBSVM tools. In this short survey we investigated techniques for solving the multiclass classification problem. Support Vector Machines are among the most robust and successful classification algorithms. They are based on the idea of maximizing the margin, i.e. maximizing the minimum distance from the separating hyperplane to the nearest example. In the multiclass extensions, additional parameters and constraints are added to the optimization problem to handle the separation of the different classes. The resulting SVM formulation leads to a large optimization problem, which may be impractical for a large number of classes. The multiclass classification of the SVM method, on the other hand, offers a better formulation with a more efficient implementation.
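For completeness, a minimal sketch of training the final LIBSVM model with the selected hyper-parameters (C = 4, gamma ~ 1) and checking its accuracy with svmpredict:

```matlab
% Train the final RBF model on all data with the selected hyper-parameters.
model = svmtrain(y, X, '-t 2 -c 4 -g 1 -q');

% svmpredict returns predicted labels, [accuracy; MSE; r^2], and decision values.
[pred, stats, dec] = svmpredict(y, X, model);
fprintf('training-set accuracy: %.2f%%\n', stats(1));
```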