How can I fit linear data with a common intercept?

Hello,
I have a set of xdata and a matrix of ydata (many sets of ydata) that I would like to fit with an equation of the form ydata = m*xdata + b where the b parameter is shared between all sets.
I really can't seem to figure out how to do this. There are some suggestions about defining a new multivariable function grouping all the functions together (linked below) but I have many data sets to fit so this seems tedious.
Attached is a picture of what my data looks like with a simple linear regression on all data. You can see that the intercept is similar, but I want to force this to be the same while still being a fit parameter.
Thanks for any suggestions!

Answers (2)

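One possibility is to create a vector from your dependent ‘ydata’ matrix, and replicate the independent ‘xdata’ vector to match it. Then estimate the regression parameters. I can’t provide a full analytic proof that this is statistically valid, but because every data set shares the same ‘xdata’, the pooled intercept satisfies the same normal equations as the common intercept of a model with a separate slope per data set, so I have no reason to believe it isn’t.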
The code:
xdata = 1:10; % Independent Variable Vector
ydata = randi(99, 10, 5); % Dependent Variable Matrix
xdatavec = repmat(xdata', size(ydata,2), 1); % Replicated Independent Variable Vector
ydatavec = ydata(:); % Dependent Variable Vector
B1 = [ones(size(xdatavec)) xdatavec]\ydatavec; % Estimate Parameters
yfit1 = [ones(size(xdata')) xdata']*B1; % Calculate Regression Line
figure(1)
plot(xdata, ydata, 'p') % Plot Data
hold on
plot(xdata, yfit1, '-m') % Plot Regression Line
hold off
grid
ydatab = ydata - B1(1); % Subtract ‘B1(1)’ From ‘ydata’
B2 = xdata'\ydatab; % Estimate Slopes, Forcing Zero Intercept
yfit2 = xdata'*B2 + B1(1); % Regression Lines + ‘B1(1)’
figure(2)
plot(xdata, ydata, 'p') % Plot Data
hold on
plot(xdata, yfit2) % Plot Regression Lines
hold off
grid
The second part subtracts ‘B1(1)’ from all the data, so the shifted data can be fitted with a zero intercept. It then estimates the individual slopes (the only remaining parameters), adds ‘B1(1)’ back, and plots the individual regression lines.
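As a cross-check, the same model can be fitted in a single least-squares solve, using one shared intercept column and one slope column per data set. This is a minimal sketch rather than part of the original answer. The names ‘A’, ‘p’, ‘n’ and ‘k’ are illustrative, and it assumes every column of ‘ydata’ shares the same ‘xdata’. Under that assumption the jointly fitted intercept and slopes should agree with ‘B1(1)’ and ‘B2’ up to rounding.

```matlab
rng(0)                                   % Fixed seed so the comparison is repeatable
xdata = 1:10;                            % Independent Variable Vector
ydata = randi(99, 10, 5);                % Dependent Variable Matrix
n = numel(xdata);
k = size(ydata, 2);

% Joint design matrix: one shared intercept column, then one slope column per data set
A = [ones(n*k, 1), kron(eye(k), xdata')];
p = A \ ydata(:);                        % p(1) = common intercept, p(2:end) = slopes

% Two-step estimate from the answer, for comparison
B1 = [ones(n*k, 1), repmat(xdata', k, 1)] \ ydata(:);
B2 = xdata' \ (ydata - B1(1));

fprintf('Intercept difference: %g\n', abs(p(1) - B1(1)))
fprintf('Largest slope difference: %g\n', max(abs(p(2:end)' - B2)))
```

If the data sets had different ‘xdata’ vectors, the two-step shortcut would no longer be guaranteed to match, and the joint solve would be the safer choice.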

6 Comments

I don't think this does what Taylor wants. This method will give one fitted slope and one fitted intercept. I believe what is wanted is one fitted intercept, but different slopes for each dataset.
I initially forgot about the second part, concentrating on calculating the common intercept. The edit corrects that omission.
Thanks for the response! I thought about doing this, but my problem with this method is that I am still forcing an intercept (in this case 0) when there might be a better one (e.g. 0.1). I really want the intercept to be a best-fit parameter that is shared across all data sets.
My pleasure!
Actually, the intercept is a fitted parameter the way I calculate it.
In the first part, I calculate a common intercept for all data, doing a linear regression on ‘xdatavec’ and ‘ydatavec’ to get the intercept (in my code, ‘B1(1)’).
In the second part, I subtract that common intercept, ‘B1(1)’, from all the ‘ydata’ values so that the shifted data can be fitted with a zero intercept. I then calculate the zero-intercept slope for each data vector (each column of ‘ydata’), which gives the slope vector ‘B2’, calculate a regression line for each data set, and add the common intercept back to all of them to create the final regression lines. So the common intercept is ‘B1(1)’, and not zero. The individual regression lines are then defined as yfit = xdata'*B2 + B1(1), as in the code.
This is the only way I can think of to do it.
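To see that the common intercept is fitted rather than fixed at zero, one can compare the total squared error for a few candidate intercepts, re-estimating the slopes each time. This is a rough sketch with a hypothetical helper ‘sse’ (not in the code above), and it assumes all columns of ‘ydata’ share the same ‘xdata’. The error at ‘B1(1)’ should be no larger than at 0 or at a nearby value.

```matlab
rng(0)                                   % Fixed seed so the comparison is repeatable
xdata = 1:10;
ydata = randi(99, 10, 5);
k = size(ydata, 2);
B1 = [ones(numel(xdata)*k, 1), repmat(xdata', k, 1)] \ ydata(:);

% Hypothetical helper: total squared error when the intercept is fixed at b
% and each slope is re-estimated through that intercept
sse = @(b) sum(sum((ydata - b - xdata' * (xdata' \ (ydata - b))).^2));

sse_common = sse(B1(1));                 % Intercept from the pooled regression
sse_zero   = sse(0);                     % Intercept forced to zero
sse_shift  = sse(B1(1) + 0.1);           % Intercept nudged away from B1(1)
fprintf('b = B1(1): %.4f\nb = 0: %.4f\nb = B1(1)+0.1: %.4f\n', ...
    sse_common, sse_zero, sse_shift)
```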
Oh, I'm sorry, I didn't quite understand the first part of your code then. Thanks, I think this will work! I'll go take a closer look and try it out.
My pleasure!
The confusion’s partially my fault. By the time I was happy with the code, I was too tired to provide the detailed explanation I should have at the time.
The first figure plots the common regression line (just to illustrate it). The second figure plots the individual regression lines.

If you have the Statistics and Machine Learning Toolbox, you can use the mvregress command to do this type of regression. You just need to specify the design matrix a bit carefully.
Take a look at my answer to this question. It has some well commented code that should help you understand what you need to do.

2 Comments

Hi,
I do have the statistics toolbox and mvregress is really close to being the right function. I had previously looked at your response and the documentation on mvregress but it seems like there are 3 options for the output of this function depending on what type of design matrix you specify:
1) different intercept and different slope terms
2) a different intercept but common slope terms
3) common intercept and common slope terms
but what I want is different slope but common intercept terms. Is there a way to do this with mvregress that I am just missing?
Thanks!
I added a design matrix (#4) to my example that implements common intercept, with different slopes.
% Named 'data' rather than 'table' to avoid shadowing MATLAB's built-in table type.
data = [65 71 63 67; ...
        72 77 70 70; ...
        77 73 72 70; ...
        68 78 75 72; ...
        81 76 89 88; ...
        73 87 76 77];
X = data(:,[1 2]);
Y = data(:,[3 4]);
[N,M] = size(Y);
% Because our target data is multidimensional, we need to first put our
% predictor data into an [N x 1] cell array, one cell per observation.
% (In this example, N=6.) Each cell contains the desired design matrix,
% with the intercepts and independent variables for that observation.
% In each cell, there is one row per dimension (M) in Y. (In this example, M=2.)
pred_cell = cell(N,1);
for i = 1:N
    % Choose ONE of the four design matrices below:
    % % (1) For each of the N points, set up a design matrix specifying
    % % different intercept and different slope terms. (This is equivalent to
    % % doing y1 and y2 independently.) This will result in 6 betas.
    % pred_cell{i,1} = [ 1 0 X(i,1) 0      X(i,2) 0      ; ...
    %                    0 1 0      X(i,1) 0      X(i,2)];
    % % (2) For each of the N points, set up a design matrix specifying
    % % a different intercept but common slope terms. This will result
    % % in 4 betas.
    % pred_cell{i,1} = [eye(2), repmat(X(i,:),2,1)];
    % % (3) For each of the N points, set up a design matrix specifying
    % % common intercept and common slope terms. This will result in 3 betas.
    % pred_cell{i,1} = [repmat([1 X(i,:)],M,1)];
    % (4) For each of the N points, set up a design matrix specifying
    % common intercept and different slope terms. This will result in 5 betas.
    pred_cell{i,1} = [ 1 X(i,1) 0      X(i,2) 0      ; ...
                       1 0      X(i,1) 0      X(i,2)];
end
% The result from passing the explanatory (X) and response (Y) variables into MVREGRESS using
% the cell format is a vector 'beta' of weights.
beta = mvregress(pred_cell, Y) %#ok<NASGU,NOPRT>
You should be able to adapt this example to your case.
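For the layout in the original question (one ‘xdata’ vector and several columns of ‘ydata’), design matrix (4) might be adapted as sketched below. The synthetic data and the names ‘xdata’, ‘ydata’, ‘true_slopes’, ‘d’ and ‘Xcell’ are assumptions made up for illustration, and ‘xdata’ is taken to be a column vector here. As far as I know, mvregress by default also estimates the covariance between the data sets, so its estimates need not match an ordinary least-squares fit exactly.

```matlab
rng(1)                                   % Fixed seed so the synthetic data is repeatable
xdata = (1:10)';                         % n-by-1 predictor (column vector)
true_slopes = [1 2 3];                   % Made-up slopes for the synthetic data
ydata = 0.5 + xdata*true_slopes + 0.1*randn(10, 3);   % n-by-d responses, true intercept 0.5
[n, d] = size(ydata);

Xcell = cell(n, 1);                      % One d-by-(1+d) design matrix per observation
for i = 1:n
    % Shared intercept column, then one slope column per data set
    Xcell{i} = [ones(d, 1), xdata(i)*eye(d)];
end

% beta(1) is the common intercept, beta(2:end) are the per-data-set slopes
beta = mvregress(Xcell, ydata);
```

This requires the Statistics and Machine Learning Toolbox, and it needs at least as many observations as there are parameters to estimate.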

Asked: 18 Sep 2015
Commented: 21 Sep 2015
