Thread Subject: Give me a Regression Problem

Subject: Give me a Regression Problem

From: Greg Heath

Date: 20 Jul, 2008 06:35:36

Message: 1 of 1

On Jul 19, 6:13 pm, baldrick <philbrier...@hotmail.com> wrote:
> On Jul 20, 6:22 am, Greg Heath <he...@alumni.brown.edu> wrote:
> > On Jul 19, 9:22 am, baldrick <philbrier...@hotmail.com> wrote:
> > > On Jul 19, 9:11 am, Greg Heath <he...@alumni.brown.edu> wrote:
> > > > On Jul 18, 1:28 pm, paulvbi...@gmail.com wrote:
> > > > -------SNIP
>
> > > > > > I just reran the SVD PNN with 3 and 6 missing and with Zaknich
> > > > > > weighting active and get R2=0.9129 which is close to the answer with
> > > > > > all the variables of R2=0.9217 (Zaknich weighting active) which might,
> > > > > > just might, indicate maybe all the variables should be included
>
> > > > I wouldn't say that (See below).
>
> > > > > > but
> > > > > > this is not clear to me. Likely there are some subtle second order
> > > > > > interaction effects here, minor for sure, but a little influence
>
> > > > Since my quadratic models obtained R^2 ~ 0.8, I think the effects
> > > > are higher order than 2nd.
>
> > > > > but as Greg says one may not be able to "prove" this also
>
> > > > I think the best you can do is reject or fail to reject a null
> > > > hypothesis based on bootstrapping and the probability
> > > > distribution for a separability measure.
>
> > > > However, as a practical engineer, I can live with
> > > > comparing overlaps of mean +/- 2*stdv confidence
> > > > intervals for MSE estimates obtained from 10-fold XVAL.
>
> > I guess I should look up the F-test and see how bad I am.
>
> > > Just been doing a bit more messing with the concrete data and it
> > > appears that drying concrete has a bit of a half life. If you take the
> > > natural logarithm of the age and use this then you should be able to
> > > get an r^2 of 0.825 using linear regression. Should save you a few
> > > neurons.
>
> > Nice !
>
> > 1. The y-to-x8 correlation coefficient increases from 0.33 to 0.55
> >    and becomes the maximum.
> > 2. The average magnitude of the xi-to-x8 (i=1:7) correlation
> >    coefficients changes from the minimum value of 0.13 to the
> >    lower value of 0.06
>
> > 3. Linear regression Backward Elimination kept all variables resulting
> >    in R^2(adjusted) = 0.817. However, the pvalue for x5 was 0.065 ...
> > 4. Linear regression Forward Selection removed x5 (pvalue = 0.43) in
> >    the selection sequence [ 1 2 3 8 5 4 (5) 7 6] = [ 1 2 3 8 4 7 6]
> >    resulting in R^2 = 0.816 for p = 7. The final pvalue for x5 was 0.065.
>
> > 5. Quadratic regression with all p = 44 variables resulted in R^2 =
> >    0.882.
> > 6. Quadratic regression Backward Elimination kept all original
> >    variables but removed 7 higher order terms: p = 37, R^2 = 0.881.
> > 7. Quadratic regression Forward Selection selected 37 variables
> >    (including the original 8). Then removed 4 higher order terms:
> >    p = 33, R^2 = 0.880.
> > 8. Quadratic regression starting with the original 8 variables, added
> >    24 higher order terms then removed 2 of them. p = 30, R^2 = 0.878.
> > 9. Quadratic regression starting with Paul's 6 variables, added 26
> >    then removed 2 to obtain the same results as above. x3 was the 1st
> >    variable added and x6 was the 6th.
> > 10. Finally, target values were standardized. Plotting errors t-y vs
> >    t, 10 - 12 points with abs(t-y) > ~1 are conspicuous ... Paul's
> >    outliers?
>
> > ... tempted to try cubic ...
>
> > ... nah
>
> > Hmm ... how many neurons can be saved using log10(age)?
>
> if you plot the model errors ordered by the target value of strength
> then hopefully you will see what I do, that the model struggles with
> the higher strengths, underestimating them.

Yes, 7 of the 12 "outliers" are in this high-strength/underestimation
region. Although I found that using continuous line plots in addition
to the usual scatter plots makes this characteristic more evident, the
sliding window average makes the point more clearly. My first
impression from my first Quadratic Regression results was that more
high-strength measurements are needed.
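
For anyone who wants to reproduce the sliding window view, here is a minimal sketch in Python. It is not the code either of us used: the function name, the window length and the toy data are all made up for illustration. It sorts the errors t-y by target t and averages them over a moving window, so a run of positive values at the high end indicates underestimation.

```python
def running_mean_error(targets, predictions, window=5):
    """Order the errors t - y by target t, then average them over a sliding window."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if not 1 <= window <= len(targets):
        raise ValueError("window must be between 1 and the number of points")
    # Sort the (target, error) pairs by target value.
    pairs = sorted((t, t - y) for t, y in zip(targets, predictions))
    errors = [e for _, e in pairs]
    # A running sum makes each window update a constant-time step.
    total = sum(errors[:window])
    means = [total / window]
    for i in range(window, len(errors)):
        total += errors[i] - errors[i - window]
        means.append(total / window)
    return means


# Synthetic illustration only (NOT the concrete data): a hypothetical model
# that saturates at 60 underestimates every target above 60.
targets = [10, 20, 30, 40, 50, 60, 70, 80]
predictions = [min(t, 60) for t in targets]
smoothed = running_mean_error(targets, predictions, window=2)
print(smoothed)
```

With real data the window would be much longer than 2; the short window here just keeps the toy example readable.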

> here are the act v pred and a running average of the model error when
> the data is ordered by strength
>...http://www.philbrierley.com/phil/images/actpred.bmp
>http://www.philbrierley.com/phil/images/errtime.bmp

Similar to my results. What is the topology of your NN?

> Maybe we are missing some vital information such as the ambient
> temperature when the concrete is first laid. I'm no concrete expert
> but I remember on lifestyle shows people get jittery when they are
> laying their house foundations and it is very cold. I would think
> there would be an optimum initial temperature.

Yes, something seems to be missing. I can't think of anything more
plausible than temperature.

Nice work.

Greg

