How do I obtain regression coefficients from a large data set?

on 11 Nov 2012

Tom Lane

I'm looking to obtain regression coefficients from three predictor variables (e.g. alpha1rad, alpha2rad, alpha3rad), where each variable is a [101 x 20] array (i.e. 101 data frames and 20 trials).

In MATLAB:

[M,N] = size(data);

mn = mean(data,2);

dev = data-repmat(mn,1,N);

For one point in time, each predictor variable was a [1 x 20] array (i.e. one data frame for 20 trials), defined as:

x1 = dev (1,:); x2 = dev (2,:); x3 = dev (3,:);

before defining X as:

X = [ones(length(x1),1) x1' x2' x3'];

Therefore, how can I define X using the larger data sets (i.e. [alpha1rad(i,:);alpha2rad(i,:);alpha3rad(i,:)]), and would this require a for loop such as:

for i=1:101;

`    data = [alpha1rad(i,:);alpha2rad(i,:);alpha3rad(i,:)];`

end
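One way the loop above could be completed is sketched below, under the assumption that a separate 20-by-4 design matrix is wanted for every frame. The random stand-in data and the name Xall are hypothetical and only illustrate the shapes involved.

```matlab
% Stand-in data (assumed): three 101-by-20 predictor arrays.
alpha1rad = rand(101,20);
alpha2rad = rand(101,20);
alpha3rad = rand(101,20);

[nFrames,N] = size(alpha1rad);
Xall = zeros(N,4,nFrames);            % one 20-by-4 design matrix per frame
for i = 1:nFrames
    data = [alpha1rad(i,:); alpha2rad(i,:); alpha3rad(i,:)];   % 3-by-20
    mn   = mean(data,2);              % mean of each predictor over trials
    dev  = data - repmat(mn,1,N);     % deviations from the per-frame mean
    Xall(:,:,i) = [ones(N,1) dev'];   % column of ones plus centered predictors
end
size(Xall)                            % N-by-4-by-nFrames
```

Each page Xall(:,:,i) then plays the role of X for frame i, so the regression can be run frame by frame.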

Tom Lane


on 12 Nov 2012

Can you explain more about what you need? Here you appear to be getting a 20-by-4 matrix X using one row from three 20-column arrays (and defining a column of ones). When you say you want to use 101-by-20 arrays, what size do you want X to be?

Tim Bennett


on 12 Nov 2012

Thanks for responding.

Like you say, the X for one point in time gave me a 20-by-4 matrix where each variable consisted of a [1 x 20] array. However, I'm now looking to obtain coefficients for a whole movement trial, where each variable (e.g. x1) is a [101 x 20] array, so I'm not sure what the size of X will be or how I could obtain the regression coefficients from this larger data set. The outcome variables (x, y, z) are also [101 x 20] arrays, so I was planning on using the following for loop:

```
for i=1:101
    data = [x1(i,:);x2(i,:);x3(i,:)];
    Y = [x(i,:);y(i,:);z(i,:)];
end
```

to apply the following code to each row of data as a function:

[M,N] = size(data);

mn = mean(data,2);

dev = data-repmat(mn,1,N);

x1 = dev (1,:);

x2 = dev (2,:);

x3 = dev (3,:);

before creating X using:

X = [ones(length(x1),1) x1' x2' x3'];

Tim Bennett


on 21 Nov 2012

Just to add to my last message, I'm using the following code to obtain the coefficients for four predictor variables and two outcome variables:

function [VUCM,VUCMp,J]=regsoccer2(data,Y,d)

% Predictor variables

[M,N] = size(data);

mn = mean(data,2);

dev = data-repmat(mn,1,N);

x1 = dev (1,:);

x2 = dev (2,:);

x3 = dev (3,:);

x4 = dev (4,:);

% Y output variables

mnY = mean(Y,2);

devY = Y-repmat(mnY,1,N);

X = [ones(length(x1),1) x1' x2' x3' x4'];

B = X\devY';

J = B';

Z = null(J);

I'm also using the following umbrella code:

for i=1:101;

`    data = [x1(i,:);x2(i,:);x3(i,:);x4(i,:)];`
`    Y = [x(i,:);y(i,:)];`
`    [perp,para,J]=regsoccer1(data,Y,d);`
`    VUCM(i)=[perp];`
`    VUCMp(i)=[para];`

end

Thanks


Tom Lane

on 14 Nov 2012

Perhaps you can build on this. Here I set up some fake data with a known relationship with a single outcome variable. Then I loop over all rows and compute the coefficients, and assemble them into a coefficient matrix. I look at the first few to make sure they capture the known relationship.

```>> x1 = rand(101,20);
>> x2 = rand(101,20);
>> x3 = rand(101,20);
>> trial = (1:101)';
>> y = repmat(trial,1,20) + x1 + 2*x2 + 3*x3 + randn(101,20)/10;
>> b = zeros(4,101);
>> for j=1:101
X = [ones(20,1),x1(j,:)',x2(j,:)',x3(j,:)'];
Y = y(j,:)'; b(:,j) = X\Y;
end
>> b(:,1:5)
ans =
1.0319    1.8816    3.0233    4.0347    5.0205
0.9892    1.0496    1.0443    1.0919    0.9576
2.0266    2.0864    1.9609    1.9148    1.8656
2.9049    3.1052    2.9136    2.9238    3.1082
```

You could embellish this to add more outcome variables (more columns of the Y matrix) and to subtract means at any point.
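A minimal sketch of that embellishment, assuming two outcome variables and mean-centering within each frame. The names ya, yb, Xc and Yc and the fake coefficients are made up for illustration. Because both X and Y are centered, the column of ones is left out here.

```matlab
% Fake data (assumed shapes): 101 frames x 20 trials per variable.
rng(0);                                   % fixed seed so the sketch is repeatable
nFrames = 101; nTrials = 20;
x1 = rand(nFrames,nTrials);
x2 = rand(nFrames,nTrials);
x3 = rand(nFrames,nTrials);
ya = x1 + 2*x2 + 3*x3 + randn(nFrames,nTrials)/10;   % outcome 1 (made up)
yb = 3*x1 - x2 + randn(nFrames,nTrials)/10;          % outcome 2 (made up)

b = zeros(3,2,nFrames);                   % predictors x outcomes x frames
for j = 1:nFrames
    X = [x1(j,:)', x2(j,:)', x3(j,:)'];   % 20-by-3
    Y = [ya(j,:)', yb(j,:)'];             % 20-by-2, one column per outcome
    % Subtract the column means over trials before fitting.
    Xc = X - repmat(mean(X,1), nTrials, 1);
    Yc = Y - repmat(mean(Y,1), nTrials, 1);
    b(:,:,j) = Xc\Yc;                     % 3-by-2 slopes for frame j
end
size(b)                                   % 3-by-2-by-101
```

Each page b(:,:,j) holds one column of slopes per outcome variable, so its transpose is a 2-by-3 matrix for that frame.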

Tim Bennett


on 21 Nov 2012

I've managed to calculate the coefficients for all 100 frames over 20 trials. This works using 3 predictor variables (each a [100 x 20] array) and 2 outcome variables (again, both [100 x 20] arrays), giving a [3 x 2] array of coefficients for each frame. The coefficients are then transposed into a Jacobian (J), and the null space of J (Z = null(J)) is used for further analysis.

However, when I add an extra predictor variable (x4, also a [100 x 20] array), I get the following error:

??? Error using ==> mtimes

Inner matrix dimensions must agree.

Error in ==> regsoccer1 at 50

UCM(:,i) = (Z'*dev(:,i))*Z;

Any help would be appreciated.
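To make the sizes in this comment concrete, here is a small sketch of how the shape of Z = null(J) changes with the number of predictors. It assumes B holds one row per predictor (no intercept row) and that J has full row rank. The coefficient values are made up.

```matlab
% Three predictors, two outcomes (made-up coefficients).
B3 = [1 0; 0 1; 1 1];        % [3 x 2] coefficient array
J3 = B3';                    % 2-by-3 Jacobian
Z3 = null(J3);               % 3-by-1: a single null-space direction

% Four predictors, two outcomes (made-up coefficients).
B4 = [1 0; 0 1; 1 1; 2 -1];  % [4 x 2] coefficient array
J4 = B4';                    % 2-by-4 Jacobian
Z4 = null(J4);               % 4-by-2: two null-space directions

size(Z3)
size(Z4)
```

With a full-rank J, the number of columns of Z equals the number of predictors minus the number of outcomes, so any code that treats Z as a single column vector will see different shapes once a fourth predictor is added.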

Tim Bennett


on 22 Nov 2012

Just adding to my last comment, I used the following function:

function [VUCM,VUCMp,J]=regsoccer1(data,Y,d)

% Predictor variables

[M,N] = size(data);

mn = mean(data,2);

dev = data-repmat(mn,1,N);

x1 = dev (1,:);

x2 = dev (2,:);

x3 = dev (3,:);

% Y output variables

mnY = mean(Y,2);

devY = Y-repmat(mnY,1,N);

X = [ones(length(x1),1) x1' x2' x3'];

B = X\devY';

J = B';

Z = null(J);

and the following umbrella code:

for i=1:101;

`    data = [x1(i,:);x2(i,:);x3(i,:)];`
`    Y = [x(i,:);y(i,:)];`
`    [perp,para,J]=regsoccer1(data,Y,d);`
`    VUCM(i)=[perp];`
`    VUCMp(i)=[para];`

end

Tim Bennett


on 25 Nov 2012

Hi Tom,

If you're not already tired of my questions, I have a bit more information that will hopefully make things clearer. Any help would be appreciated.

I'm trying to see how joint angles of the right leg (predictor variables X = x1, x2, x3 for the hip, knee, and ankle angles respectively, each a [1 x 20] array) could potentially stabilise the position of the right foot (outcome variables Y = x, y coordinate positions, both [1 x 20] arrays) using linear regression.

"dev" (a [1 x 20] array for each predictor variable) holds the deviation of each joint angle from the mean joint-angle configuration at each trial. These deviations are projected onto the null space of J (Z = null(J)) using the following code:

for i = 1:N

UCM(:,i) = (Z'*dev(:,i))*Z;

end

The UCM is used to look at the control of a movement and is approximated linearly using the null space (Z) of the J matrix.

This code works for one frame (i.e. a [1 x 20] array) and for a whole normalised movement cycle (i.e. a [101 x 20] array) with 3 predictor variables (PVs) and 2 output variables (OVs), where Z is a [3 x 1] array.

However, when I increase the number of PVs to 4 (resulting in a [4 x 2] Z array) and 5 (resulting in a [5 x 2] Z array) with 2 OVs, I get the following error:

??? Error using ==> mtimes

Inner matrix dimensions must agree

Therefore, I'm unable to analyse any data above three PVs.
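One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: when Z is [3 x 1], Z'*dev(:,i) is a scalar, so (Z'*dev(:,i))*Z is valid. When Z is [4 x 2], Z'*dev(:,i) is 2-by-1 and cannot be multiplied on the right by a 4-by-2 matrix. Writing the projection as Z*(Z'*dev(:,i)) works for any number of null-space columns, because null returns an orthonormal basis. The Jacobian and the random dev below are made up, and dev is assumed to be 4-by-N.

```matlab
rng(0);                          % fixed seed so the sketch is repeatable
% Made-up Jacobian for 4 predictors and 2 outcomes (illustration only).
J = [1 0 1 2; 0 1 1 -1];
Z = null(J);                     % 4-by-2 orthonormal basis of the null space
N = 20;
dev = randn(4,N);                % stand-in for mean-centered joint angles

UCM = zeros(4,N);
for i = 1:N
    % Z'*dev(:,i) is 2-by-1 (coordinates in the null-space basis);
    % pre-multiplying by Z maps them back to a 4-by-1 joint-space vector.
    UCM(:,i) = Z*(Z'*dev(:,i));
end

% Equivalent projection of all trials at once, without the loop.
UCM2 = Z*(Z'*dev);
```

For a single-column Z the two orderings give the same result, so this form should also reproduce the three-predictor case.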
