linear regression on excel dataset

I am new to MATLAB and have tried to run a linear regression with the following code:
>> filename = 'C:\Users\Troels\Dropbox\Analyse & Resultater\MATLAB\Danmark OX.xls';
>> ds = xlsread(filename) % the dataset "Danmark OX.xls" is printed
>> mdl = LinearModel.fit(ds)
Error using classreg.regr.TermsRegression/handleDataArgs (line 629)
Y argument is required unless X is a dataset.
Error in LinearModel.fit (line 891)
[X,y,haveDataset,otherArgs] = LinearModel.handleDataArgs(X,varargin{:});
I have also tried this code:
>> ds = dataset('XLSFile','C:\Users\Troels\Dropbox\Analyse & Resultater\MATLAB\Danmark OX.xls','ReadObsNames',true);
Warning: Variable names were modified to make them valid MATLAB identifiers.
> In @dataset\private\genvalidnames at 56
In @dataset\private\setvarnames at 40
In dataset.readXLSFile at 49
In dataset.dataset>dataset.dataset at 352
>> mdl = LinearModel.fit(ds);
Warning: Regression design matrix is rank deficient to within machine precision.
> In TermsRegression>TermsRegression.checkDesignRank at 98
In LinearModel.LinearModel>LinearModel.fit at 944
Am I on the right track with either of my two attempts? And can anyone tell from the above what I am doing wrong?
  1 Comment
dpb on 28 Jul 2013
Please reformat the code and remove the excess lines for legibility.
The first attempt fails because LinearModel.fit, when called with a single argument, requires a dataset, and the result of xlsread() is a plain numeric matrix, not a dataset.
The second attempt is fine from that standpoint, but the warning indicates that the columns of your design matrix are linearly dependent (or nearly so), which can also be a symptom of poorly scaled data. Search for "rank deficiency" in the online documentation if you don't know what that is, and for possible workarounds.
It would be better to fix the problem with a better design matrix, but sometimes that's not possible.
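One way to check for the rank deficiency mentioned above is to compare the numerical rank of the design matrix with its number of columns. The following is only a minimal sketch: the matrix X is hypothetical (its third column is deliberately the sum of the first two) and stands in for the predictor columns of your own data.

```matlab
% Hypothetical predictor matrix: the third column equals the sum of the
% first two, so the columns are linearly dependent.
X = [1 2  3;
     4 5  9;
     7 8 15;
     2 0  2];

% Add the intercept column that a linear model includes by default.
designMatrix = [ones(size(X, 1), 1) X];

r = rank(designMatrix);        % numerical rank of the design matrix
p = size(designMatrix, 2);     % number of columns (coefficients to estimate)
isRankDeficient = r < p;       % true when some column is redundant

fprintf('rank = %d, columns = %d, rank deficient = %d\n', r, p, isRankDeficient);
```

If the rank is smaller than the number of columns, at least one column can be written as a combination of the others (for example a constant column, a duplicated column, or a full set of dummy variables next to the intercept), and removing it is usually the cleanest fix.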


Accepted Answer

Shashank Prasanna on 28 Jul 2013
mdl = LinearModel.fit(ds)
assumes that ds is a dataset (your second approach) and that the last column of ds is the response variable. Is that true in your case?
If you want to pass data as a matrix then you have to do so this way:
mdl = LinearModel.fit(X,y)
I urge you to read the documentation of LinearModel.fit to understand how to call it. This will save you a lot of time later on. There are plenty of examples there.
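As a rough sketch of the matrix form, the numeric output of xlsread could be split into predictors and response before the call. The matrix below is made up and only stands in for what xlsread(filename) returns; the assumption that the response sits in the last column must be checked against your own spreadsheet.

```matlab
% Hypothetical numeric matrix standing in for the output of xlsread(filename).
% Assumption: the last column holds the response variable.
data = [1.0 2.0  3.1;
        2.0 1.0  3.9;
        3.0 4.0  7.2;
        4.0 3.0  8.1;
        5.0 6.0 10.9;
        6.0 5.0 12.0];

X = data(:, 1:end-1);    % predictor columns
y = data(:, end);        % response column

% Requires the Statistics Toolbox. An intercept term is added automatically,
% so X should not contain a column of ones.
mdl = LinearModel.fit(X, y);
disp(mdl.Coefficients)   % estimates, standard errors, t-statistics, p-values
```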

More Answers (1)

T27667 on 28 Jul 2013
Oh, my second attempt actually worked: the estimates and so on were produced in spite of the warnings.
However, I am missing some output compared with what another statistical package (OxMetrics) gives:
                  Coefficient  Std.Error  t-value  t-prob  Part.R^2
Constant              1.36882     0.2155     6.35  0.0000    0.0708
Index               -0.936838    0.06085    -15.4  0.0000    0.3090
Institutionel       -0.515027    0.06092    -8.45  0.0000    0.1188
Risk                0.0352812   0.004220     8.36  0.0000    0.1165
Equity               0.248302    0.05861     4.24  0.0000    0.0328
Allocation           0.375370    0.05182     7.24  0.0000    0.0901
Performance fee     0.0138219   0.007027     1.97  0.0497    0.0072
Share Class TNA    -0.0323253    0.01039    -3.11  0.0020    0.0179

sigma                 0.33081     RSS                 58.0007371
R^2                   0.614132    F(7,530) =          120.5 [0.000]**
Adj.R^2               0.609035    log-likelihood      -164.218
no. of observations   538         no. of parameters   8
mean(Y)               1.22317     se(Y)               0.529066

Normality test:   Chi^2(2)   = 44.367 [0.0000]**
Hetero test:      F(10,527)  = 2.5403 [0.0054]**
Hetero-X test:    F(13,524)  = 2.0211 [0.0175]*
RESET23 test:     F(2,528)   = 5.7023 [0.0035]**
How can I produce in MATLAB the various statistics and tests that OxMetrics provides above?
  1 Comment
Shashank Prasanna on 28 Jul 2013
Edited: Shashank Prasanna on 28 Jul 2013
I've never used OxMetrics, but you may want to check the documentation of LinearModel.fit for all the goodness-of-fit statistics it calculates. If you don't find a specific one, it may be available separately in the Statistics Toolbox.
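As a starting point, the sketch below shows where several of the listed quantities might be read from a fitted LinearModel object, plus two tests built on top of it. It uses synthetic data purely for illustration (not the poster's spreadsheet). The Jarque-Bera test (jbtest) is not necessarily the same normality test that OxMetrics reports, and the RESET-style check is a hand-rolled F-test on added powers of the fitted values, which is an assumption about what RESET23 computes. The heteroscedasticity tests are not covered here.

```matlab
% Synthetic data for illustration only.
rng(0);                                   % reproducible random numbers
n = 100;
X = randn(n, 2);
y = 1 + X*[2; -1] + 0.5*randn(n, 1);

mdl = LinearModel.fit(X, y);              % requires the Statistics Toolbox

% Coefficient table: Estimate, SE, tStat, pValue.
disp(mdl.Coefficients)

% Summary statistics comparable to the OxMetrics footer.
fprintf('sigma (RMSE)   = %g\n', mdl.RMSE);
fprintf('RSS (SSE)      = %g\n', mdl.SSE);
fprintf('R^2            = %g\n', mdl.Rsquared.Ordinary);
fprintf('Adj. R^2       = %g\n', mdl.Rsquared.Adjusted);
fprintf('log-likelihood = %g\n', mdl.LogLikelihood);
fprintf('observations   = %d, parameters = %d\n', ...
        mdl.NumObservations, mdl.NumCoefficients);

% Overall F-test of the regression.
disp(anova(mdl, 'summary'))

% Jarque-Bera normality test on the raw residuals.
res = mdl.Residuals.Raw;
[hNormal, pNormal] = jbtest(res);
fprintf('Jarque-Bera p-value = %g\n', pNormal);

% RESET-style check: refit with squared and cubed fitted values added,
% then F-test the two added terms jointly.
yhat   = mdl.Fitted;
mdlAug = LinearModel.fit([X yhat.^2 yhat.^3], y);
q      = 2;                                % number of added terms
F      = ((mdl.SSE - mdlAug.SSE)/q) / (mdlAug.SSE/mdlAug.DFE);
pReset = 1 - fcdf(F, q, mdlAug.DFE);
fprintf('RESET F(%d,%d) = %g, p = %g\n', q, mdlAug.DFE, F, pReset);
```

With the poster's 538 observations and 8 parameters, the augmented model would have 528 residual degrees of freedom, which matches the F(2,528) reported for RESET23 above.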
