|
On Jul 16, 6:54 pm, baldrick <philbrier...@hotmail.com> wrote:
> On Jul 17, 5:12 am, Greg Heath <he...@alumni.brown.edu> wrote:
>
> > On Jul 15, 7:14 am, baldrick <philbrier...@hotmail.com> wrote:
> > -----SNIP
>
> > > An even quicker and not so dirty method to find the variable
> > > importance is to use the neural network model itself!
>
> > > Build a neural network model and then systematically randomise each of
> > > the inputs in turn and see how much using a random value (from the
> > > same distribution) rather than the actual value destroys the model.
> > > Repeat this many times and take an average.
>
> > How many times did you repeat the randomizations?
> > How did you calculate the tabulated values for
> >   a. scrambled correlation
> >   b. relative importance
>
> > This procedure estimates the importance of each
> > predictor when all other variables are present. It is
> > the ranking at the first step of stepwise (and stage-
> > wise) backward elimination. In general, if correlations
> > between predictors are not insignificant, the rankings
> > above the last will change as the procedure is
> > continued; i.e., the last ranked variable is removed,
> > a new net is designed, and the randomization is
> > repeated to obtain the next lowest ranked variable.
>
> > In this sense, the technique does qualify as dirty.
>
> > However, as previously indicated by the dirtier, but
> > quicker, linear and quadratic regression stagewise
> > backward elimination procedures, none of the variables
> > are insignificant.
>
> > Therefore, these rankings are credible.
>
> > Hope this helps.
>
> > Greg
>
> The randomizations were repeated 100 times for each variable being
> tested. It is no good doing it just once - the more the merrier. If
> you do it only a few times you will not get consistent results.
>
> The scrambled correlation is the new model r^2 when the variable in
> question is messed around with, or 'scrambled', averaged over the
> number of times the 'scrambling' is repeated.
>
> The relative importance is calculated by simple linear transformation
> based on the scrambled correlations, such that the variable whose
> scrambled correlation is lowest gets an importance of 1 and any
> variable whose scrambled correlation is the same as the normal model
> gets an importance of 0 (which means that it does not matter what
> value that variable has). It is possible to get negative importance
> which would mean randomizing that variable is actually improving the
> model!
>
> You have to drop statistical thinking to understand what this method
> is telling you. It is saying, 'if I use this current model, what will
> happen to the performance if one of my variables goes belly up'.
>
> This is particularly important in areas such as credit risk. If you
> are using a field such as FICO score in your model, and FICO suddenly
> decide they are going to calculate it in a different way without
> telling you, or you lose the data feed, then you need to know what
> will happen to your model. Another example is personal income, which
> gradually increases over time - if your model is heavily reliant on
> income, then it will start to deteriorate quite quickly.
>
> There is also no reason why strongly correlated variables should not be
> used together. Historically this is to do with the maths behind
> finding the coefficients using certain techniques - inverting matrices
> and so on. Logically, I would rather have say, both income and bank
> balance in my model even if they were highly correlated. How do I know
> which one is the real driver of whatever is being predicted? And there
> is no reason why they should stay correlated (banks know your balance,
> but you could start lying about your income). Having both in the
> model is kind of hedging your bets against things going wrong (you
> would want them to have similar importance though).
>
> Personally I use these importance calculations for initially trimming
> out the rubbish (as in the random numbers I put in the concrete data)
> and getting down to the variables of interest. It does save a lot of
> time. I have come across model builders who have thousands of
> candidate variables and spend months inspecting each one in turn -
> only to end up getting rid of 95% of them.
This technique should be better than just clamping an input to its
mean value.
Is this used in a backward elimination mode, i.e., toss out the
worst variable, design a new net and repeat?
Hope this helps.
Greg
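
For anyone who wants to try the scrambling procedure described in the
quoted text, here is a rough Python sketch. It is not baldrick's code.
It assumes the model is available as a function predict(rows) that
returns one prediction per row, that r^2 means 1 - SSE/SST (the post
does not say which definition was used), and that "randomising" an
input means shuffling that column across the rows. The function names
are made up for illustration.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SSE/SST (assumed definition of r^2)."""
    mean_y = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse / sst


def scrambled_r2(predict, X, y, col, n_repeats=100, rng=None):
    """Average r^2 of the fixed model when column `col` of X is shuffled.

    X is a list of rows (lists of floats); the model is not retrained.
    """
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_repeats):
        shuffled = [row[col] for row in X]
        # Shuffling keeps the input's distribution but breaks its link to y.
        rng.shuffle(shuffled)
        X_s = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
        total += r_squared(y, predict(X_s))
    return total / n_repeats


def relative_importance(base_r2, scrambled):
    """Linear rescaling: lowest scrambled r^2 -> 1, same as base r^2 -> 0.

    A value below 0 means scrambling that input improved the model.
    """
    span = base_r2 - min(scrambled)
    if span == 0:
        # No input changes the score, so nothing can be ranked.
        return [0.0 for _ in scrambled]
    return [(base_r2 - s) / span for s in scrambled]
```

To use it, compute base_r2 = r_squared(y, predict(X)) once, call
scrambled_r2 for each column, and pass the resulting list to
relative_importance. The default of 100 repeats matches the number
quoted above.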
|