Scatterplot and Correlation of table variables

Hello there
Q = readtable("DA.xlsx")
Q = 20×18 table
     Tam      Az     Clp      Dp     DH     DN     EB     GH      Pw    Sd      Prs     WD     WV      AD     ZT       La        Lo     Al
    ____    ____    ____    ____    ___    ___    ___    ___    ____    __    _____    ___    ___    ____    ___    _____    ______    ___
    22.7    -143      41    21.8      0      0      0      0    3.44     0    916.2    294      1    0.11    155    4.156    9.2632    707
    22.5    -126    21.4    21.7      0      0      0      0    3.37     0    915.7    291      1    0.11    144    4.156    9.2632    707
      22    -118       0    21.3      0      0      0      0     3.3     0    915.5    299    0.9    0.11    132    4.156    9.2632    707
    21.8    -114       0    21.1      0      0      0      0    3.26     0    915.7    301    0.9    0.11    118    4.156    9.2632    707
    21.6    -113       0    20.9      0      0      0      0    3.22     0    915.9    295    0.7    0.11    104    4.156    9.2632    707
    21.7    -113       0      21      7      0      0      7    3.16     0    916.1    290    0.6    0.11     91    4.156    9.2632    707
    22.2    -115       0      21     81     27     27     84    3.12     0    916.7    295    0.5    0.11     77    4.156    9.2632    707
    23.6    -118    13.4    21.5    183    168    134    190    3.14     0    917.4     36    0.3    0.11     64    4.156    9.2632    707
      25    -125    22.2    21.8    311    272    214    311     3.2     0    917.5    102    0.9    0.11     51    4.156    9.2632    707
      26    -136    20.1    21.4    457    371    347    457    3.27     0      917    120    1.1    0.11     39    4.156    9.2632    707
      27    -154    21.8    21.1    536    465    365    536    3.32     0    916.4    133    1.1    0.11     31    4.156    9.2632    707
    27.6     178    22.6      21    566    508    488    566    3.36     0    915.8    154    1.2    0.11     28    4.156    9.2632    707
    27.6     151    30.5    21.2    506    442    441    506    3.38     0      915    175    1.2    0.11     32    4.156    9.2632    707
    27.5     134    42.7    21.3    361    360    354    361    3.39     0    914.4    199    1.1    0.11     41    4.156    9.2632    707
    27.2     124    32.4    21.2    313    246    211    313    3.43     0      914    218    1.3    0.11     53    4.156    9.2632    707
    26.6     118    17.8      21    209    119     98    221    3.46     0    913.8    233    1.6    0.11     65    4.156    9.2632    707
x=[Q.Tam,Q.Az,Q.Clp,Q.Dp,Q.DH,Q.DN,Q.EB, Q.Pw ,Q.Sd,Q.Prs,Q.WD,Q.WV,Q.AD, Q.ZT, Q.La, Q.Lo, Q.Al];
y=[Q.GH];
mdl=fitlm(x,y)
Warning: Regression design matrix is rank deficient to within machine precision.
mdl =
Linear regression model:
    y ~ 1 + x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12 + x13 + x14 + x15 + x16 + x17

Estimated Coefficients:
                     Estimate          SE        tStat       pValue
                   __________    ________    _________    _________
    (Intercept)             0           0          NaN          NaN
    x1                 2.0783      5.6941      0.36499       0.7501
    x2              0.0055895    0.018668      0.29942      0.79287
    x3              -0.079915     0.20744     -0.38525      0.73717
    x4                -4.7595      6.9256     -0.68723      0.56293
    x5                  1.044    0.053721       19.434    0.0026374
    x6              -0.068095    0.062486      -1.0898      0.38962
    x7             -0.0045757      0.0573    -0.079855      0.94362
    x8                 22.631      45.981      0.49218      0.67131
    x9                      0           0          NaN          NaN
    x10               0.84604      3.6954      0.22894      0.84019
    x11             -0.033195    0.032357      -1.0259      0.41281
    x12               -9.5612      9.7954     -0.97609      0.43196
    x13                     0           0          NaN          NaN
    x14              0.043679     0.14711      0.29692      0.79452
    x15                     0           0          NaN          NaN
    x16                     0           0          NaN          NaN
    x17               -1.1029      4.9284     -0.22378      0.84371

Number of observations: 20, Error degrees of freedom: 7
Root Mean Squared Error: 3.03
R-squared: 1,  Adjusted R-Squared: 1
F-statistic vs. constant model: 7.6e+03, p-value = 3.38e-13
corrplot(Q)
Error using corrplot
One or more variables show no variation. Correlations undefined.
Why is the p-value NaN for x9, x13, x15 and x16 but not for x17, even though x17 also has the same value in every row?
Also, how do I get a scatter plot, and how can I fix the correlation of the input variables with the output GH?

Accepted Answer

Pratyush on 16 Jun 2023
Hi Mbunya.
It is my understanding that you are trying to fit a linear model to the data stored in x, with y as the ground truth. However, you are getting NaN for some of the weights after fitting the model. I have analysed your data, and the problem appears to have the following cause.
Some columns in your data (Sd, AD, La, Lo and Al) have the same value in every sample. This causes two problems when fitting a linear model.
  1. The design matrix used to fit the linear model becomes rank deficient, and MATLAB warns that the regression design matrix is rank deficient. Every constant column is a multiple of the intercept column of ones (Sd is all zeros), so the intercept and these five columns cannot be estimated separately. Judging by your output, fitlm kept one term from this redundant group, x17 (Al), and reported 0 with NaN statistics for the rest (the intercept, x9, x13, x15 and x16). That is why x17 gets a p-value even though it is constant.
  2. These fields also add no value to the fit. If a property is the same in every sample, it cannot help decide the model output for any particular sample. It is a constant property.
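The rank deficiency described in the list above can be checked directly. Here is a minimal sketch, assuming a small subset of the rows and columns shown in the question (Tam and Pw vary; Sd, AD and Al are constant). The names X, D, isConstant and constantNames are illustrative and do not come from the original post.

```matlab
% First six rows of a few columns, copied from the table in the question.
Tam = [22.7; 22.5; 22.0; 21.8; 21.6; 21.7];
Pw  = [3.44; 3.37; 3.30; 3.26; 3.22; 3.16];
Sd  = zeros(6, 1);          % constant column (all zeros)
AD  = 0.11 * ones(6, 1);    % constant column
Al  = 707  * ones(6, 1);    % constant column

X     = [Tam, Pw, Sd, AD, Al];
names = {'Tam', 'Pw', 'Sd', 'AD', 'Al'};

% A column is constant when its maximum equals its minimum.
isConstant    = max(X, [], 1) == min(X, [], 1);
constantNames = names(isConstant);
disp(constantNames)

% Design matrix with a leading column of ones for the intercept,
% which is the term "1" in the model formula y ~ 1 + x1 + ...
D = [ones(size(X, 1), 1), X];

% The intercept, AD and Al are multiples of each other and Sd is zero,
% so the rank is smaller than the number of columns.
fprintf('columns: %d, rank: %d\n', size(D, 2), rank(D));
```

When the rank is smaller than the number of columns, some coefficients cannot be estimated independently, and those are the ones reported as NaN.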
I retrained the model using your data after removing the above mentioned constant properties as shown below.
x=[Q.Tam,Q.Az,Q.Clp,Q.Dp,Q.DH,Q.DN,Q.EB, Q.Pw ,Q.Prs,Q.WD,Q.WV, Q.ZT];
y=[Q.GH];
mdl=fitlm(x,y)
This time no NaN values were observed for any of the weights or the intercept in the linear model, as shown in the image below. Hope this resolves your problem.
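Instead of listing the remaining columns by hand, the constant columns can also be found and dropped by name. The sketch below is only an illustration under assumptions: it rebuilds a few columns from the 16 rows displayed in the question rather than reading DA.xlsx, and the names predictors, isConst and keep are hypothetical. Passing a table to fitlm labels the coefficients with the variable names instead of x1, x2, and so on.

```matlab
% A few columns from the 16 rows displayed in the question.
Tam = [22.7 22.5 22 21.8 21.6 21.7 22.2 23.6 25 26 27 27.6 27.6 27.5 27.2 26.6]';
DH  = [0 0 0 0 0 7 81 183 311 457 536 566 506 361 313 209]';
WV  = [1 1 0.9 0.9 0.7 0.6 0.5 0.3 0.9 1.1 1.1 1.2 1.2 1.1 1.3 1.6]';
Sd  = zeros(16, 1);         % constant
Al  = 707 * ones(16, 1);    % constant
GH  = [0 0 0 0 0 7 84 190 311 457 536 566 506 361 313 221]';
Q   = table(Tam, DH, WV, Sd, Al, GH);

% Every variable except the response is a candidate predictor.
predictors = setdiff(Q.Properties.VariableNames, {'GH'}, 'stable');

% Flag predictors whose values never change.
isConst = varfun(@(v) max(v) == min(v), Q(:, predictors), ...
                 'OutputFormat', 'uniform');
keep = predictors(~isConst);

% Fit on the varying predictors only (fitlm needs the
% Statistics and Machine Learning Toolbox).
mdl = fitlm(Q(:, [keep, {'GH'}]), 'ResponseVar', 'GH');
disp(mdl.Coefficients)
```

With the full table from readtable the same steps should apply unchanged, since nothing here depends on the number of columns.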
Here are some additional resources to help you with linear regression.
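For the scatter plot and correlation part of the question, one possible approach is to plot each input against GH and compute its correlation coefficient with corrcoef, skipping constant columns because their correlation is undefined (the same reason corrplot stops with an error). This is a hedged sketch on a subset of the 16 rows displayed in the question. The names names and r are illustrative.

```matlab
% A few columns from the 16 rows displayed in the question.
Tam = [22.7 22.5 22 21.8 21.6 21.7 22.2 23.6 25 26 27 27.6 27.6 27.5 27.2 26.6]';
DH  = [0 0 0 0 0 7 81 183 311 457 536 566 506 361 313 209]';
Al  = 707 * ones(16, 1);    % constant, so correlation is undefined
GH  = [0 0 0 0 0 7 84 190 311 457 536 566 506 361 313 221]';
Q   = table(Tam, DH, Al, GH);

names = setdiff(Q.Properties.VariableNames, {'GH'}, 'stable');
r = nan(1, numel(names));   % correlation of each input with GH

figure;
for k = 1:numel(names)
    v = Q.(names{k});
    if max(v) > min(v)      % only varying columns have a correlation
        c = corrcoef(v, Q.GH);
        r(k) = c(1, 2);
    end
    subplot(1, numel(names), k);
    scatter(v, Q.GH, 'filled');
    xlabel(names{k});
    ylabel('GH');
    title(sprintf('r = %.2f', r(k)));
end

disp(table(names(:), r(:), 'VariableNames', {'Predictor', 'r_with_GH'}))
```

Constant inputs keep r = NaN here. Removing them from the table before calling corrplot should avoid the "no variation" error as well.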


Release

R2023a
