File Exchange

## Support Vector Regression

version 1.3 (3.45 KB) by

A MATLAB implementation of Support Vector Regression (SVR).

3.28571
10 Ratings

Updated

Support Vector Regression (SVR) is a function approximation technique based on statistical learning theory. The method is robust and generalizes well while still being able to capture complex relationships in the input data.
This implementation uses the Optimization Toolbox's SQP solver to minimize the ε-insensitive empirical risk functional plus a regularization term, which yields the support vectors and their weights.
The purpose of the submission is to provide a "bottom-up" implementation that demonstrates how Support Vector Regression can be implemented in MATLAB, and to let the user experiment with different kernel functions and optimization algorithms at a low level.
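
For orientation, the ε-insensitive risk plus regularization term is usually written as the primal problem below. This is the standard textbook form, not something read off the submission's code; the symbols w, b, ξ, ξ* and the feature map φ are the usual notation, and C and ε play the role of the c and epsilon arguments.

```latex
\min_{w,\,b,\,\xi,\,\xi^*} \quad \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^* \right)
\quad \text{subject to} \quad
\begin{cases}
y_i - \langle w, \phi(x_i) \rangle - b \le \varepsilon + \xi_i, \\
\langle w, \phi(x_i) \rangle + b - y_i \le \varepsilon + \xi_i^*, \\
\xi_i \ge 0, \quad \xi_i^* \ge 0 .
\end{cases}
```

Training points whose error stays within ε contribute nothing to the sum, which is why only some of the points end up as support vectors.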

TODO:
- Make the training independent of the optimization toolbox

### Charles

Exciting, as I am new to SVM and wish to try applying it to Forex markets. I see there is plenty of literature out there, so thank you for the code.

NISHA K G

### Ronald Clark

Hi Ruocheng,

I am aware of this issue but haven't had the time to update the files.

I'll update the code soon with some new features too!

As of 2015, MATLAB does have a standard implementation, with documentation available at https://www.mathworks.com/help/stats/support-vector-machine-regression.html, which you can use.

Ronnie

### Ruocheng Guo

Hi Ronnie,

Your code does not produce the correct values for alpha because there is no constraint forcing z[1:ntrain] = z[ntrain+1:2*ntrain] when alpha2 == 0, or z[1:ntrain] = -z[ntrain+1:2*ntrain] when alpha1 == 0. You need to add a constraint for this.

Also, the second part of the upper bound should be c*ones(ntrain,1) instead of 2*c*ones(ntrain,1): it is either alpha1 or alpha2, and each should lie in the range from 0 to c.

### Ronald Clark

Hi Nico M, the main difference is that my script is completely self-contained (easy to learn from and adapt for research purposes).

### Nico M

Dear Ronnie Clark,

Thanks for sharing.
Because I've just started assessing what MATLAB files are available on SVM, I was wondering: what are the differences from the built-in MATLAB class 'CompactRegressionSVM'?

Nico

### Su Wutyi Hnin

Dear Ronnie Clark,
Thank you for the new optimization script.
I would like to ask one thing.
After we optimize the three parameters, I don't know how to set those parameters in the PREDICT function.
Or is there another function for that?

### Justin Igwe

Hello @Ronald, your code seems helpful because its predictions follow exactly the same trend as my target. Unfortunately, the predicted values are all around 20% lower than the actual values, which pushes the prediction MSE above 3500. How can I correct this error? Thanks

### Ben Tarrahi

Hi Tom, try making the target vector (y) zero-mean; I tested the same data and the prediction is almost perfect.
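
A minimal sketch of that centring step, using the data from Tom's question. The "model" in the middle is only a stand-in (a least-squares line on centred inputs), not the submission's SVR; it marks where the trained model's predictions would go. The variable names are made up for illustration.

```matlab
% Toy data from the question: y = x on 1..10.
x = (1:10)';
y = (1:10)';

% Centre the targets before training and remember the offset.
yMean = mean(y);
yCentered = y - yMean;

% Stand-in for "train, then predict" (NOT the submission's SVR):
% a least-squares line through the origin on centred inputs.
xMean = mean(x);
slope = ((x - xMean)' * yCentered) / ((x - xMean)' * (x - xMean));
predCentered = slope * (x - xMean);

% Add the offset back to return to the original units.
pred = predCentered + yMean;
disp([y pred]);
```

The important part is the last step: whatever produces the centred predictions, the stored mean has to be added back before comparing against the original targets.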

### Tom

I used the data (x=[1 2 3 4 5 6 7 8 9 10]', y=[1 2 3 4 5 6 7 8 9 10]'); x is used for both training and prediction. The predicted result is [-1.7568 -0.7587 0.2414 1.2389 2.2402 3.2354 4.2407 5.2294 6.2448 7.2202]
Why is the predicted result always about 2.7 lower than the actual value? Any idea?

### Su Wutyi Hnin

I would like to see a MATLAB code example for a hybrid of a genetic algorithm and support vector regression.

### CHlin

Hi Ronald Clark, could you please advise how to use my data to train the SVM? My data is reliability data, which is a function of time,
for example
R(t1)=.9,R(t2)=.89, . . . . R(t30)=.97
and t1=.4, t2=.7, t3=1.5,.......t30=9
so I was wondering what the inputs x and y should be. Could it be R(ti) for x and ti for y?
Thank you very much

### Ronald Clark

Hi yzc233, do you have the optimization toolbox?

Hi ats, the easiest way to find the parameters is to do a grid search over a range of values.
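
A rough sketch of such a grid search with k-fold cross-validation. To keep it runnable without the submission or any toolbox, the learner here is a stand-in (Gaussian kernel ridge regression, with 1/C as the ridge term), not the SVR trainer itself; the grids, the data and all variable names are arbitrary choices for illustration. With the real code, the two lines marked "stand-in learner" would be replaced by the train and predict calls.

```matlab
% Synthetic 1-D data (illustrative only, deterministic "noise").
x = linspace(0, 2*pi, 40)';
y = sin(x) + 0.1*cos(7*x);

% Candidate values on logarithmic grids (ranges are arbitrary).
cGrid     = 10.^(-1:3);
gammaGrid = 10.^(-2:1);

% Gaussian kernel between the rows of A and the rows of B.
sqdist = @(A, B) max(bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') - 2*(A*B'), 0);
gaussK = @(A, B, gamma) exp(-gamma * sqdist(A, B));

k = 5;                               % number of folds
n = numel(y);
fold = mod((0:n-1)', k) + 1;         % interleaved fold label per sample

cvErr = zeros(numel(cGrid), numel(gammaGrid));
for i = 1:numel(cGrid)
    for j = 1:numel(gammaGrid)
        sse = 0;
        for f = 1:k
            tr = (fold ~= f);
            te = ~tr;
            % Stand-in learner: kernel ridge regression.
            K    = gaussK(x(tr, :), x(tr, :), gammaGrid(j));
            coef = (K + eye(nnz(tr)) / cGrid(i)) \ y(tr, :);
            yHat = gaussK(x(te, :), x(tr, :), gammaGrid(j)) * coef;
            sse  = sse + sum((y(te, :) - yHat).^2);
        end
        cvErr(i, j) = sse / n;       % mean squared validation error
    end
end

% Pick the pair with the lowest cross-validated error.
[bestErr, idx] = min(cvErr(:));
[iBest, jBest] = ind2sub(size(cvErr), idx);
bestC = cGrid(iBest);
bestGamma = gammaGrid(jBest);
fprintf('best C = %g, gamma = %g, CV MSE = %.4g\n', bestC, bestGamma, bestErr);
```

Epsilon can be searched the same way by adding a third loop. A coarse logarithmic grid followed by a finer one around the best cell is a common way to keep the cost down.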

### ats

I am wondering, how could we specify the optimal value for Cost and Gamma?

### yzc233

Undefined function 'optimoptions' for input arguments of type 'char'.?

### Patrick Bukenya

Hi Clark,
Thanks for the wonderful code. I would like to know how you analyse the figures produced by the code, and what the best way is to use cross-validation to choose C and epsilon.
Thanks,
Patrick

### Ronald Clark

The code as you gave it works for me:

xdata =[198 187 184 178 166 150 144 145 181 ];
ydata =183;
c=400 ;
epsilon=0.00000025
kernel='gaussian'
varargin=0.5

-----------------

svrobj = svr_trainer(xdata,ydata,400,0.00000025,'gaussian',0.5);

Please try the command as above ^ and let me know what happens. Also, check that you have not accidentally made xdata or ydata a cell array.

You would obviously need more than one training point to give useful predictions!
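
A small sketch of the cell-array check mentioned above. Here ydata is deliberately created as a cell so the conversion has something to do; the final call is left commented out because it needs the submission on the path, and kernelArgs is a made-up name.

```matlab
% Example inputs; ydata is deliberately a cell array to show the conversion.
xdata = [198 187 184 178 166 150 144 145 181];
ydata = {183};

% Convert cell arrays of numbers to plain numeric arrays.
if iscell(xdata), xdata = cell2mat(xdata); end
if iscell(ydata), ydata = cell2mat(ydata); end

% Fail early with a clear message if something non-numeric slipped through.
assert(isnumeric(xdata) && isnumeric(ydata), ...
    'xdata and ydata must be numeric arrays, not cells or strings.');

% Inside a wrapper function, varargin is itself a cell array. Forward its
% contents with {:} rather than passing the cell as one argument.
kernelArgs = {0.5};   % hypothetical stand-in for what varargin would hold
% svrobj = svr_trainer(xdata, ydata, 400, 0.00000025, 'gaussian', kernelArgs{:});
```

Passing the cell itself instead of kernelArgs{:} is one way a kernel parameter can end up as a cell, which would produce the kind of "Undefined operator for input arguments of type 'cell'" error quoted in this thread.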

### Tak120

Hello Clark,
Thanks for sharing this file.
I tested this code for time series prediction, so I put:
xdata =[198 187 184 178 166 150 144 145 181 ];
ydata =183;
c=400
epsilon=0.00000025
kernel='gaussian'
varargin=0.5

but I get this error: Undefined operator '-' for input arguments of type 'cell'.

Error in svr_trainer>@(x,y)exp(-lambda*(norm(x.feature-y.feature,2)^2)) (line 26)
kernel_function = @(x,y) exp(- lambda * (norm(x.feature-y.feature,2)^2));

Error in svr_trainer (line 54)
M = arrayfun(kernel_function,xi,xj);

Thank you!

### Ronnie Clark

Hi Purushottam

I think it is a problem with your input data (hence varargin).

Could you give an example of the input you are supplying?

### Purushottam Sawant

Hi, thanks for sharing this file.
I am getting errors in optimoptions and varargin. How can I solve this type of error?

### zhenhui xiang

Hi, thanks for sharing this file. Do you have any suggestions on how to modify your code so that it can take multiple input features? Thanks!

### fatimah mohammed

Hi, thanks for sharing this file.
I am working on an activity recognition project and would like to use your file for it. Is that possible?

Also, can I feed my accelerometer training and test data into your code to classify the activities?

Thanks and regards,
Fatimah mohammed

### Said Hassan

Hi Clark,
Thanks for sharing the file.
I'd like to know if this code can be used to predict concentrations from absorbance data.
my e-mail: said.hassan@pharma.cu.edu.eg

Dear Ronnie Clark,

Thank you so much for the shared function for svm (svr_trainer). I find it very helpful. However, I have two queries about it:

1/ Does the function handle complex numbers?
2/ I have the following data:
x_train: 2x1 (row vector)
y_train: 2x72 (Matrix with 2 columns and 72 lines)
How can I input this data? The function keeps telling me "Dimensions of matrices being concatenated are not consistent."

Please I will highly appreciate your answer since this is related to my PhD defense and I really need to finish it as soon as possible.

Thank you in advance, Looking forward to hearing from you.

Best.

### Ronald Clark

Hi Atiyo, yes, you can have a multi-dimensional output, but it won't model dependencies between the individual output variables and the input, i.e. it will treat the output as a single element.
----------
Hi Royi, you can just use a linear kernel; however, you need some means of evaluating similarity, so you do need some kernel.
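
To make the linear-kernel point concrete: "no kernel" amounts to using the plain inner product as the similarity measure. The sketch below uses made-up data and a hypothetical function handle named linearKernel; it is not the kernel definition from the submission.

```matlab
% Three 2-D training vectors, one per row (made-up numbers).
X = [1 2; 3 4; 5 6];

% Linear kernel: k(u, v) = u * v' for row vectors u and v.
linearKernel = @(u, v) u * v';

% Gram matrix, entry (i, j) = k(X(i,:), X(j,:)).
n = size(X, 1);
K = zeros(n);
for i = 1:n
    for j = 1:n
        K(i, j) = linearKernel(X(i, :), X(j, :));
    end
end

% For the linear kernel the double loop collapses to one matrix product.
assert(isequal(K, X * X'));
disp(K);
```

With this kernel the fitted function is linear in the inputs, so choosing it is the kernel-free case in everything but name.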

### Royi Avital

Hi Ronnie,
Simple question: what if I don't want to apply any kernel to the data (or choose the identity function as the kernel)?

It seems I have to select a Kernel in your code.

Is there an issue with having no Kernel?
Is there no sense in going Kernel free?

Thank You.

### Atiyo Banerjee

Dear Ronnie Clark,
Thanks for your file exchange. I would like to ask you if the code is suitable for multi-output regression? If not, then how can I adapt it to a MIMO situation?

### Hui Song

Dear Ronnie Clark, thanks for your file exchange.
I would like to know what the outputs of the training and the prediction are, respectively.
Thanks.

### Bhupendra Sahare

Dear Ronnie Clark, thanks for your file exchange.

I have been experiencing a problem with the prediction function. When I predict the output from the training input it gives great results, but when I use data other than the training input it gives the same output for every point, which is a wrong prediction. Can you help me out with this?

### Thiago

Dear Ronnie Clark,
thanks for sharing with us your code. I see you have been maintaining this for a while now. I have been experiencing a strange behavior: every fit I make with your function gives me an additional shift. You can generate a very simple dataset (1d straight line for instance) and when you plot the prediction against the original data you'll see this (vertical) shift. My guess is that b is either ill-defined or being lost somewhere. Sorry I couldn't debug further.

If you can check this, it'd be fantastic!
Th

### julienD

Hi,
Thank you for the file. How do you choose the epsilon and c parameters?

### amelite

I have the same question about the f/2 at the end of the file. The predicted result seems to be half of the ground truth data.

### Nur

Hi,
I have a problem: the function 'optimoptions' is undefined.

### Zheng

Thank you, OP. But be careful when using the code; there are some bugs, for example H=0.5%[M...]
and f=f/2. The most critical one is that, after augmenting the variable space, there should be a constraint between the variables.

### mostafa

Hi,
I want MATLAB code for "stock price prediction using support vector regression".
Can you help me?

### Charles Zhao

Hi,

I have been looking at the kernels available in your code, and the implementation of the tanh kernel puzzles me. You have it as:

prod(tanh(a*x.feature'*y.feature+c))

when it should be:

tanh(a*x.feature*y.feature'+c)

(since x.feature is a row vector)?

Have I misunderstood the nature of the kernel? Should prod only be there for spline?
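
The shape argument in this comment can be checked with a few lines. The sketch assumes, as the comment does, that the features are row vectors; the numbers and the parameter values a and c are arbitrary.

```matlab
x = [1 2 3];             % row feature vectors, as assumed in the comment
y = [0.5 -1 2];
a = 0.1;                 % arbitrary kernel parameters for illustration
c = 0.2;

outerXY = x' * y;        % 3-by-3 outer product
innerXY = x * y';        % scalar inner product: 0.5 - 2 + 6 = 4.5

% prod works down the columns of the 3-by-3 matrix, giving a 1-by-3 row.
kProd = prod(tanh(a * outerXY + c));

% The usual sigmoid (tanh) kernel is a single scalar.
kSigmoid = tanh(a * innerXY + c);

disp(size(kProd));
disp(kSigmoid);
```

So for row-vector features the first expression does not even return a scalar, whereas the second is the standard sigmoid kernel tanh(a*<x,y> + c).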

### Minho Xu

Hi Clark,
I have a question about the f/2 at the end of the file. Why do we take half of the prediction?
Can you please explain? Thank you

### Ronnie Clark

c is the cost associated with the training errors, and epsilon defines when the errors start to be penalised (i.e. the 'insensitive region'). There is really no hard rule for setting these values; they're typically found using an exhaustive search for the combination that gives the best cross-validation performance.
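
A tiny numerical illustration of the two parameters, using made-up residuals. It shows the ε-insensitive loss on its own, not the submission's training code.

```matlab
epsilon = 0.5;                      % half-width of the insensitive region
c = 10;                             % cost per unit of error outside it

residual = [-1.2 -0.3 0 0.4 2.0];   % y - f(x) for five made-up points

% Errors inside [-epsilon, epsilon] cost nothing; outside, the cost grows
% linearly with the distance from the edge of the region.
loss = max(abs(residual) - epsilon, 0);
penalty = c * sum(loss);

disp(loss);
disp(penalty);
```

Only the first and last points lie outside the region, so only they are penalised. Since epsilon is measured in the units of y, the noise level or range of y is a sensible place to start when choosing it.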

### Matthias

I'm new to SVR. What are c and epsilon? What are reasonable ranges (maybe relative to the range of your y)?

### Ronnie Clark

That is strange. Could you please give a small sample of the data you're using (i.e. All_Train', All_Target', varargin...)?

### Coo Boo

Hi
I tried the new version and I got the following warning and error message:
Warning: Cannot use sparse matrices with sqp algorithm: converting to full.
> In fmincon at 477
In svr_trainer at 49
In Train_FL_SVR_Reg at 40
Error using .*
Matrix dimensions must agree.
Error in svr_trainer/W (line 91)
cost = sum(alpha.*ydata - epsilon*abs(alpha));
Error in fmincon (line 640)
initVals.f = feval(funfcn{3},X,varargin{:});
Error in svr_trainer (line 49)
alpha = fmincon(@W, alpha0, [],[],Aeq, beq, lb, ub,[],options);
Error in Train_FL_SVR_Reg (line 40)
net = svr_trainer(All_Train',All_Target',c,epsilon,kernel,varargin);
Caused by:
Failure in initial user-supplied objective function evaluation. FMINCON
cannot continue.

### Ronnie Clark

Hi Omari. Yes, on the surface it will work just like with ANN.

### Omari

Hi Ron, today is my first day reading about SVR. I work on predicting thermal stresses with ANNs, and now I want to work with SVR. Is it possible to predict thermal stresses with SVR as with an ANN? I work with about 15 inlets. Thanks in advance, Ron. Regards

### Everton

That's right. I misunderstood the results, sorry.
Thanks!

### Ronald Clark

Hi Everton

The XOR you posted does work. I get:

[0,0] -> 8.3824e-04 (ie. 0.00083824)
[1,0] -> 0.9991
[0,1] -> 0.9991
[1,1] -> 8.3824e-04

It's not exact but very close.

### Everton

Hi,
Is SVR supposed to work on a simple XOR example? What am I doing wrong? Thanks

epsilon=0.000025;
c=40000;
xdata = [ 0 0 ; 0 1 ; 1 0; 1 1 ]';
ydata = [ 0; 1 ; 1 ; 0 ]';
[~, alpha0,b0] = svr(xdata,ydata,[],[],[],c,epsilon);
result = svr(xdata,[], [ 0 ; 0 ], alpha0, b0,[],[]);
disp(result) % 8.3789e-04 expected: 0

### Ronnie Clark

Hi Ion

They are (n,1) where n is the dimensionality of your data :)

### Ion Vasile

Hi Ronnie!

I want to ask you: what are the dimensions of xdata and ydata?

Thank you

### Ronnie Clark

Yes you are correct, this implementation uses the substitution alpha=alpha1-alpha2, abs(alpha)=alpha1+alpha2 and alpha1*alpha2=0 which makes performing the optimization easier. This formulation of the dual problem is well documented in the literature so I do believe it is correct - see for example "An Introduction to Support Vector Machines and Other Kernel-based Learning Methods (28 March 2000) by Nello Cristianini, John Shawe-Taylor", pg 116.

Secondly, yes, using a matrix formulation for the objective function will be faster - I just haven't had time to implement it.
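
As an illustration of the matrix formulation, here is a sketch of the single-variable dual objective W(alpha) = alpha'*y - epsilon*sum(abs(alpha)) - 0.5*alpha'*K*alpha written both with loops and as matrix products. The numbers are made up and the variable names are not taken from the submission; since fmincon minimizes, it would be given the negative of this value.

```matlab
% Small made-up problem.
K = [2 1 0; 1 2 1; 0 1 2];       % symmetric kernel (Gram) matrix
y = [1; -1; 0.5];
alpha = [0.3; -0.2; -0.1];       % satisfies sum(alpha) == 0
epsilon = 0.1;

% Loop form of the quadratic term.
quadLoop = 0;
for i = 1:numel(alpha)
    for j = 1:numel(alpha)
        quadLoop = quadLoop + alpha(i) * alpha(j) * K(i, j);
    end
end
wLoop = sum(alpha .* y) - epsilon * sum(abs(alpha)) - 0.5 * quadLoop;

% Matrix form: same value, no loops.
wMat = alpha' * y - epsilon * norm(alpha, 1) - 0.5 * (alpha' * K * alpha);

fprintf('loop: %.6f, matrix: %.6f\n', wLoop, wMat);
```

The solver calls the objective many times, so replacing the O(n^2) interpreted loop with one matrix product is where most of the speed-up would come from.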

Regards!
Ronnie

### Xingxin

Your dual problem formulation doesn't seem correct. There should be two Lagrange multipliers, alpha1 and alpha2, associated with the lower and upper bound constraints in the primal problem. The objective function for the dual problem is:
-0.5(alpha1-alpha2)'*K*(alpha1-alpha2)+(alpha1-alpha2)'y-epsilon*(alpha1+alpha2)'1
where 0<=alpha1,alpha2<=C, and (alpha1-alpha2)'1=0 (1 denotes the vector of ones).

In your svr function, it seems you changed the variable to alpha=alpha1-alpha2 and used abs(alpha) for alpha1+alpha2. This does not look correct.

I am also stuck using MATLAB optimization tools such as fmincon or quadprog to implement SVR where two arguments appear in the objective function.

Also, I notice the objective function is calculated with for-loops. This is much slower than using matrix operations.
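
One way to hand the two-multiplier form to quadprog is to stack the variables as z = [alpha1; alpha2] and negate the objective so that it becomes a minimization. The sketch below only builds the matrices from the formula in this comment, using made-up data and hyperparameters; it is not the submission's code, and the solve step runs only if quadprog (Optimization Toolbox) is on the path.

```matlab
% Made-up training data: three 1-D points.
x = [0; 1; 2];
y = [0.1; 0.9; 2.1];
C = 10;                 % illustrative hyperparameters
epsilon = 0.1;
gamma = 0.5;
n = numel(y);

% Gaussian kernel matrix.
D = bsxfun(@minus, x, x').^2;
K = exp(-gamma * D);

% With z = [alpha1; alpha2], the negated dual is
%   minimise 0.5*z'*H*z + f'*z
H   = [K, -K; -K, K];
f   = [epsilon - y; epsilon + y];
Aeq = [ones(1, n), -ones(1, n)];   % (alpha1 - alpha2)' * 1 = 0
beq = 0;
lb  = zeros(2*n, 1);
ub  = C * ones(2*n, 1);            % each multiplier lies in [0, C]

% Solve only if quadprog is available.
if exist('quadprog', 'file') == 2
    z = quadprog(H, f, [], [], Aeq, beq, lb, ub);
    alpha = z(1:n) - z(n+1:end);   % net weight of each training point
    disp(alpha');
end
```

Because both multipliers are explicit variables here, no abs() is needed in the objective and the upper bound on every entry of z is simply C.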

### Rahul

When I used this, I got the error "Undefined function 'svr' for input arguments of type 'double'". Could anyone please tell me the solution? I would also appreciate it if the solution could be sent to 'rahulmekala@gmail.com'.

### Rahul

Can you tell me whether I can use this code for the fisheriris data?
I have tried but I couldn't get a result.
What do I have to do now?
It's urgent.

### tayari

Hello again,
how do you set the values of xdata, ydata, X1 and Y1?

### Ronnie Clark

Hi

It's the number of training vectors.

tayari

hello
thank you

### Ronnie Clark

Ok, sunspot.dat contains yearly sunspot activity data measured by the Wolf relative sunspot number.

So, assuming you have the data x(n), x(n-1)...x(1) and what you want to forecast is x(n+1), you have to decide how you want to forecast this value, i.e. what data do you have that could possibly give you an accurate prediction? Now, as long-term sunspot activity is typically cyclical, a possible set of input data (xdata) to use might be a vector of past values x(n), x(n-1)...

The output (ydata) is then the desired sunspot activity forecast, x(n+1) which in this case would be a single value.
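
A minimal sketch of that windowing, using a synthetic cyclical series in place of the real data (for the real case, `series` would be the sunspot-number column loaded from sunspot.dat). The window length p and all variable names other than xdata and ydata are arbitrary choices, and whether the trainer expects one sample per row or per column should be checked against the example file in the submission.

```matlab
% Synthetic cyclical series standing in for the real data.
t = (1:60)';
series = sin(2*pi*t/11) + 0.05*t;

p = 4;                               % past values per input vector (a choice)
m = numel(series) - p;               % number of training pairs

xdata = zeros(m, p);
ydata = zeros(m, 1);
for r = 1:m
    xdata(r, :) = series(r:r+p-1)';  % window x(r) ... x(r+p-1)
    ydata(r)    = series(r+p);       % the value that follows the window
end

% Input for forecasting the first unseen value x(n+1).
xNext = series(end-p+1:end)';
disp(size(xdata));
```

Each row of xdata is one training input and the matching entry of ydata is its one-step-ahead target, so a single series yields many training pairs.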

However, using only endogenous variables as the input (i.e. past sunspot values) will probably not give an accurate prediction of the amplitudes of the sunspot activity. A better idea is to augment the input data (xdata) with exogenous variables that are correlated with sunspot activity, such as solar wind data, magnetic storm activity, flare activity or radio and X-ray emission data. The goal is to let the SVR model the relationship between these variables and the relative sunspot number, and hopefully achieve a more accurate forecast.

There is a small example file in the submission which shows how to use the function. Here is a bit more detailed explanation of the parameters:

alpha - the weight of each support vector (generated by the function on training)
beta - a linear constant (like an offset in the SVR model, also generated on training)
c - the cost of the 'training errors' (user parameter that must be set)
epsilon - the magnitude of the 'insensitive' region (user parameter that must be set)

Hope that helps!

### David Franco

Hello again Ronnie,

But, for example, if I have a series of past values (x(n), x(n-1), x(n-2)...), will this series be xdata or ydata? Could you please explain the input arguments xdata, ydata, x, alpha, b, c, epsilon? Or maybe help me to make a test file for time series forecasting, with sunspot.dat (it is included in MATLAB) for example (I am a newbie to MATLAB).

Thank you very much sir!

### Ronnie Clark

Yes sure you can! Time series forecasting doesn't necessarily need a 1D input though - you can also use
1) A number of features extracted from the time series as the input or,
2) A series of past values as the input eg. x(n), x(n-1), x(n-2)....

### David Franco

Hello,

Can I use this code to make a regression with an 1D input?
For example: time series forecasting.

Thank you!

### Ronnie Clark

Thank you!

Yes, it should be possible to use fmincon. I will have a look and update the file shortly. I'm also planning on adding more kernel functions and cleaning up the code.

### Charles Harrison

Ronnie,

Awesome function. Would it be possible to upload a version that doesn't require the Optimization Toolbox? I know you utilize multivariate constrained nonlinear optimization. I tried to merge John D'Errico's fmincon minimization into this but it didn't work out right.

Mehmet Pilgir