Good reply, Ilya, thank you kindly...
"Ilya Narsky" <inarsky@mathworks.com> wrote in message <jqlnpm$buu$1@newscl01ah.mathworks.com>...
> "Evan Ruzanski" <ruzanski@alumni.colostate.edu> wrote in message
> news:jqll5a$t8v$1@newscl01ah.mathworks.com...
> > Hello,
> >
> > I'm trying to replicate the results of TreeBagger (random forest of
> > regression trees) corresponding to one result generated within a nested
> > for loop. Because of the random nature of TreeBagger, it is not obvious
> > how to do this. Let me elaborate...
> >
> > I'm looping over two parameters in search of the "best" settings for a
> > particular application. I specify and initialize the random number stream
> > before the loop so it looks like this:
> >
> > ntrees = 10:10:300;
> > thresh = 50:10:150;
> >
> > %% LOAD X AND Y HERE %%
> >
> > ntrain = round(size(X,1)/2);
> >
> > RandStream.setGlobalStream(RandStream('mlfg6331_64','seed',29));
> > options = statset('UseParallel','never', 'Streams',...
> > RandStream.getGlobalStream,'UseSubStreams','never');
> >
> > for i = 1:length(ntrees)
> > for j = 1:length(thresh)
> >
> > tb = TreeBagger(ntrees(i),X(1:ntrain,:),Y(1:ntrain),'method','regression',...
> > 'Options',options);
> > ts = predict(tb,X(ntrain+1:end,:));
> >
> > %% EVALUATE TS VS. Y(NTRAIN+1:END) USING THRESH(J) AS A PARAMETER %%
> >
> > end
> > end
> >
> > %% SAVE EVALUATION RESULTS TO DISK %%
> >
> > The problem is to discover how to replicate a run, say ntrees = 100 and
> > thresh = 100, outside the loop. The results come out very differently if I
> > run through the loop and later select the result corresponding to ntrees =
> > 100 and thresh = 100 vs. running one instance of TreeBagger with ntrees =
> > 100 and thresh = 100.
> > The question is: How can I get these to be the same without going through
> > the entire loop?
> >
> > Thank you kindly...
> >
>
> The easiest thing to do would be to pass the RNG seed at the beginning of
> every iteration by executing, for instance
>
> rng(j+(i-1)*length(thresh))
>
> Then you could execute rng with the same seed before running TreeBagger
> outside the loop.
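
A minimal sketch of the per-iteration seeding suggested above, not a tested implementation. It assumes X, Y, ntrain, ntrees and thresh are defined as in the original post, and it leaves out the 'Options' argument on the assumption that TreeBagger then draws from the global stream that rng reseeds. The names i0 and j0 are introduced here only for illustration.

```matlab
% Sketch only: X, Y, ntrain, ntrees and thresh come from the original post.
for i = 1:length(ntrees)
    for j = 1:length(thresh)
        % Unique seed for each (i,j) pair, so any single run can be replayed.
        rng(j + (i-1)*length(thresh));

        tb = TreeBagger(ntrees(i), X(1:ntrain,:), Y(1:ntrain), ...
            'Method', 'regression');
        ts = predict(tb, X(ntrain+1:end,:));

        % ... evaluate ts against Y(ntrain+1:end) using thresh(j) ...
    end
end

% Later, outside the loop: replay only the ntrees = 100, thresh = 100 run.
i0 = find(ntrees == 100);          % hypothetical helper index
j0 = find(thresh == 100);          % hypothetical helper index
rng(j0 + (i0-1)*length(thresh));   % same seed the loop used for this pair

tb = TreeBagger(ntrees(i0), X(1:ntrain,:), Y(1:ntrain), ...
    'Method', 'regression');
ts = predict(tb, X(ntrain+1:end,:));
```

The stand-alone run should match the in-loop run only if nothing else consumes random numbers between the rng call and TreeBagger in either place.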
>
> Assigning into the Substream property of the RandStream object at the
> beginning of each iteration would be a better option for randomness. I doubt
> this would matter in practice since TreeBagger with a few hundred trees
> generates a fairly small number of random numbers overall.
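
The Substream alternative could look roughly like the sketch below. It assumes the same variables as the original post and reuses the 'mlfg6331_64' generator from there, which supports substreams. The stream variable s and the index k are names made up for this example.

```matlab
% Sketch only: X, Y, ntrain, ntrees and thresh come from the original post.
s = RandStream('mlfg6331_64', 'Seed', 29);   % generator with substream support
options = statset('Streams', s);             % s is a handle, so options sees later changes

for i = 1:length(ntrees)
    for j = 1:length(thresh)
        % Jump to the start of a dedicated substream for this (i,j) pair.
        k = j + (i-1)*length(thresh);
        s.Substream = k;

        tb = TreeBagger(ntrees(i), X(1:ntrain,:), Y(1:ntrain), ...
            'Method', 'regression', 'Options', options);
        ts = predict(tb, X(ntrain+1:end,:));

        % ... evaluate ts against Y(ntrain+1:end) using thresh(j) ...
    end
end

% Later, outside the loop: rebuild the stream and select the same substream.
s = RandStream('mlfg6331_64', 'Seed', 29);
options = statset('Streams', s);
s.Substream = find(thresh == 100) + (find(ntrees == 100) - 1)*length(thresh);

tb = TreeBagger(100, X(1:ntrain,:), Y(1:ntrain), ...
    'Method', 'regression', 'Options', options);
ts = predict(tb, X(ntrain+1:end,:));
```

Unlike reseeding with consecutive integers, separate substreams of one stream are designed to be statistically independent of each other, which is the "better option for randomness" mentioned above.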
>
> Ilya
