Code covered by the BSD License  


4.5 | 15 ratings | 212 downloads (last 30 days) | File size: 16.1 KB | File ID: #31036

Random Forest


13 Apr 2011 (updated)

Creates an ensemble of CART trees, similar to MATLAB's TreeBagger class.


File Information
Description

An alternative to MATLAB's TreeBagger class, written in C++ and MATLAB.

Creates an ensemble of CART trees (a Random Forest). The code includes an implementation of CART trees that is considerably faster to train than MATLAB's classregtree.

Compiled and tested on 64-bit Ubuntu.
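A minimal usage sketch, assembled from the calls quoted in the comments below (`Stochastic_Bosque`, `eval_Stochastic_Bosque`, and the `'ntrees'` name-value parameter); treat it as an assumption about the interface, and check the defaults in your copy of the code:

```matlab
% Train a forest of 50 CART trees on a feature matrix Data (one row per
% sample, columns double-typed, not integer) and a label vector Labels.
% Run mx_compile_cartree first so the mex files are built, and make sure
% the package folders are on the path, e.g. addpath(genpath(cd)).
Random_Forest = Stochastic_Bosque(Data, Labels, 'ntrees', 50);

% Evaluate on (here) the training data; f_output holds predicted labels.
[f_output, f_votes] = eval_Stochastic_Bosque(Data, Random_Forest);
accuracy = numel(find(Labels(:) - f_output(:) == 0)) / numel(f_output)
```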

Acknowledgements

getargs.m inspired this file.

MATLAB release: MATLAB 7.9 (R2009b)
Comments and Ratings (61)
31 Jan 2014 Hussein

Could anyone give an example of how to use this function, i.e. the input parameters? I built it successfully, so if anyone could advise, please do. Thanks

18 Dec 2013 fairy  
15 Oct 2013 Fatemeh Saki

Hi everybody,
Does anyone know how I can visualize the built tree after training?
Thanks

Fatemeh

16 Aug 2013 Gary Tsui

Try this in GBCC.cpp, line 3:
#define log2(x) ( (1.0/log(2.0)) * log( (double)(x) ) ) // use double constants

That's what I did. Can anyone else help to verify?

15 Jun 2013 Fatemeh Saki

Hi Leo,
I cannot run the code; an error occurs while compiling the mx_compile file.
Would you please help me with that?

29 Dec 2012 fairy

I have found the reason: the Data matrix is integer-typed, not double.

23 Dec 2012 fairy

Thanks Leo!

With your help, I have compiled successfully!

By the way, line 24 should change to saved_logs[j] = log((double)(j+1))/log(2.0);

Thanks Leo again.

22 Dec 2012 Leo

Hi fairy,

lcc is not a C++ compiler. Using the Visual Studio compiler, I think the following should do the trick.

in GBCC.cpp change line 24 to

saved_logs[j] = log(j+1)/log(2);

line 115 to

if (diff_labels[nl]>0) bh-=diff_labels[nl]*(log(diff_labels[nl])/log(2)-log(sum_W)/log(2));

line 151 to

if(diff_labels_l[nl]>0) ch-=(diff_labels_l[nl])*(log(diff_labels_l[nl])/log(2)-log(sum_l)/log(2));

and line 152 to

if(diff_labels_r[nl]>0) ch-=(diff_labels_r[nl])*(log(diff_labels_r[nl])/log(2)-log(sum_W-sum_l)/log(2));

Hope this solves it.

Leo

22 Dec 2012 fairy

Hi Leo

These are the errors and warnings I get when I compile 'mx_compile_cartree':
mx_compile_cartree
GBCC.cpp
GBCC.cpp(24) : error C2563: mismatch in formal parameter list
GBCC.cpp(24) : error C2568: '=' : unable to resolve function overload
C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\math.h(567): could be 'long double log(long double)'
C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\math.h(519): or 'float log(float)'
C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\math.h(121): or 'double log(double)'
GBCC.cpp(24) : error C2143: syntax error : missing ';' before 'constant'
GBCC.cpp(24) : error C2064: term does not evaluate to a function taking 1 arguments
GBCC.cpp(25) : warning C4244: '=' : conversion from 'double' to 'int', possible loss of data
GBCC.cpp(43) : warning C4244: '=' : conversion from 'double' to 'int', possible loss of data
GBCC.cpp(108) : warning C4244: '=' : conversion from 'double' to 'int', possible loss of data
GBCC.cpp(115) : error C3861: 'log2': identifier not found
GBCC.cpp(115) : error C3861: 'log2': identifier not found
GBCC.cpp(127) : warning C4244: '=' : conversion from 'double' to 'int', possible loss of data
GBCC.cpp(151) : error C3861: 'log2': identifier not found
GBCC.cpp(151) : error C3861: 'log2': identifier not found
GBCC.cpp(152) : error C3861: 'log2': identifier not found
GBCC.cpp(152) : error C3861: 'log2': identifier not found

C:\PROGRA~1\MATLAB\R2012B\BIN\MEX.PL: Error: Compile of 'GBCC.cpp' failed.

Error using mex (line 206)
Unable to complete successfully.

Error in mx_compile_cartree (line 8)
mex -O best_cut_node.cpp GBCR.cpp GBCP.cpp GBCC.cpp
I am using MATLAB 2012 and VC++ 2008.

22 Dec 2012 fairy

Hi Leo

These are the errors and warnings I get when I compile 'mx_compile_cartree':

lcc preprocessor warning: .\node_cuts.h:8 best_cut_node.cpp:2 No newline at end of file
lcc preprocessor warning: best_cut_node.cpp:60 No newline at end of file
Error best_cut_node.cpp: .\node_cuts.h: 2 redeclaration of `GBCC' previously declared at .\node_cuts.h 1
Error best_cut_node.cpp: .\node_cuts.h: 5 redeclaration of `GBCR' previously declared at .\node_cuts.h 4
Error best_cut_node.cpp: .\node_cuts.h: 8 redeclaration of `GBCP' previously declared at .\node_cuts.h 7
Error best_cut_node.cpp: 35 type error in argument 5 to `GBCC'; found `int' expected `pointer to double'
Error best_cut_node.cpp: 35 type error in argument 7 to `GBCC'; found `pointer to double' expected `int'
Error best_cut_node.cpp: 35 insufficient number of arguments to `GBCC'
Error best_cut_node.cpp: 38 type error in argument 5 to `GBCP'; found `int' expected `pointer to double'
Error best_cut_node.cpp: 38 type error in argument 7 to `GBCP'; found `pointer to double' expected `int'
Error best_cut_node.cpp: 38 insufficient number of arguments to `GBCP'
Error best_cut_node.cpp: 41 type error in argument 5 to `GBCR'; found `int' expected `pointer to double'
Error best_cut_node.cpp: 41 type error in argument 6 to `GBCR'; found `pointer to double' expected `int'
Error best_cut_node.cpp: 41 insufficient number of arguments to `GBCR'
Error best_cut_node.cpp: 59 undeclared identifier `delete'
Error best_cut_node.cpp: 59 illegal expression
Error best_cut_node.cpp: 59 syntax error; found `method' expecting `]'
Error best_cut_node.cpp: 59 type error: pointer expected
Warning best_cut_node.cpp: 59 Statement has no effect
Error best_cut_node.cpp: 59 syntax error; found `method' expecting `;'
Warning best_cut_node.cpp: 59 Statement has no effect
Warning best_cut_node.cpp: 59 possible usage of delete before definition
17 errors, 5 warnings

C:\PROGRA~1\MATLAB\R2012B\BIN\MEX.PL: Error: Compile of 'best_cut_node.cpp' failed.

Error using mex (line 206)
Unable to complete successfully.

Error in mx_compile_cartree (line 8)
mex -O best_cut_node.cpp GBCR.cpp GBCP.cpp GBCC.cpp

I am using MATLAB 2012 and VC++ 2008.

22 Dec 2012 Leo

Hi fairy,

Could you copy paste the exact error message you get when running mx_compile_cartree.m ?

Leo

21 Dec 2012 fairy

Hi Leo, thanks for your help, but now I have another error:
??? Undefined function or method 'best_cut_node' for input arguments of type 'char'.

Error in ==> cartree at 84
[bestCutVar bestCutValue] = ...

Error in ==> Stochastic_Bosque at 48
Random_ForestT = cartree(Data(TDindx,:),Labels(TDindx), ...

It seems that mx_compile_cartree.m failed to compile. Specifically, this command failed: mex -O best_cut_node.cpp GBCR.cpp GBCP.cpp GBCC.cpp. Why? Thanks again.

20 Dec 2012 zeel

How do I use this function? I mean, what parameters do I have to give it?

18 Dec 2012 Leo

Hi fairy,

It would seem that the function is not on MATLAB's search path. You can run

addpath(genpath(cd))

Leo

18 Dec 2012 fairy

Leo,
I'll paste the code so you can give me advice. Thank you.
load diabetes
Data = diabetes.x;
Labels = diabetes.y;
Random_Forest = Stochastic_Bosque(Data,Labels);

I get this error:
??? Undefined function or method 'cartree' for input arguments of type 'double'.

Error in ==> Stochastic_Bosque at 48
Random_ForestT = cartree(Data(TDindx,:),Labels(TDindx), ...
Could you kindly tell me why? Thanks very much.

13 Dec 2012 LE

Hi Leo, I am afraid this package cannot handle categorical features. How could I update the code to handle datasets with categorical features?
Kindly guide me.
Thanks.

11 Dec 2012 Leo

Hi qing,

Yes, the elements of the vector "nodeCutVar" are feature indexes. You can retrieve the tree structure from the field RETree.childnode. For a node i, the indexes of the child nodes are RETree.childnode(i) and RETree.childnode(i) + 1, for the left and right child respectively.

Hope this helps.

Leo
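Leo's layout above can be walked recursively. A hedged sketch: the field names `childnode` and `nodeCutVar` come from his comment, but the leaf convention is my assumption (I am guessing `childnode(i) == 0` marks a leaf; check how the actual code marks terminal nodes before relying on this):

```matlab
function print_tree(RETree, i, depth)
% Recursively print an implicit binary tree in which the children of
% node i are stored at childnode(i) and childnode(i)+1.
if nargin < 2, i = 1; depth = 0; end
if RETree.childnode(i) == 0                 % assumed leaf marker
    fprintf('%sleaf\n', repmat('  ', 1, depth));
else
    fprintf('%ssplit on feature %d\n', ...
        repmat('  ', 1, depth), RETree.nodeCutVar(i));
    print_tree(RETree, RETree.childnode(i), depth + 1);      % left child
    print_tree(RETree, RETree.childnode(i) + 1, depth + 1);  % right child
end
end
```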

11 Dec 2012 qing

Hi leo,

Are the elements of the vector "nodeCutVar" feature indexes? But, how can I see the tree structure? I mean the relationship between features. Thanks!

30 Oct 2012 Linh Dang

Could anybody give an example of how to run these files? Really appreciated.

03 Oct 2012 Marios  
26 Jul 2012 Michael

Quick, clean and easy to use.
A useful submission.

24 May 2012 mai

Hi Leo,
I'm new to MATLAB and would like to explore random forests, but I don't understand most of this. In
function Random_Forest = Stochastic_Bosque(Data,Labels,varargin)
Data refers to my data. What are Labels and varargin for?

21 May 2012 Michael

Excellent work, code is well documented and clear, plus runtime is reasonable.

Adding a Readme file with description of the data format, and a demo.m would be very helpful.

Thanks for sharing.

11 Apr 2012 Leo

Hi Matteo,

Sorry for the late reply, did not receive a notification email.

Anyway you are correct, that is a bug in the code. It was pointed out by c. a few comments up.

The code has now been updated to remove that line.

Thanks for the feedback and rating.

23 Mar 2012 Matteo

Very good software, thank you for your effort!
I was wondering whether sampling with replacement is really implemented in this method, as the documentation says.
At lines 43-44 you do:
TDindx = round(numel(Labels)*rand(n,1)+.5);
(NOTE: why not use the 'randi' function?)
which gets 'n' indexes, and THEN you do:
TDindx = unique(TDindx);
removing all the duplicates! Is that correct?
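For reference, Matteo's point can be seen directly: a true bootstrap draw keeps duplicates. A sketch, assuming a label vector `Labels` is in scope (`randi` exists in newer MATLAB releases):

```matlab
n = numel(Labels);
% Sample n indexes uniformly with replacement: a true bootstrap draw.
% Numerically equivalent to round(n*rand(n,1)+.5), but clearer.
TDindx = randi(n, n, 1);
% Do NOT follow this with unique(TDindx): discarding the duplicates
% turns bagging into plain subsampling of the training data.
```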

22 Mar 2012 Jeff

Hi Leo, based on your experience if this program is converted into pure C/C++, does that help improving the processing speed on PC?

06 Mar 2012 Enric Junque de Fortuny

I noticed that the f_output vector sometimes swaps dimensions in eval_Stochastic_Bosque(). Quick fix: add this at line 34 of eval_Stochastic_Bosque():
if (size(Data,1) ~= size(f_output,1))
f_output = f_output';
f_votes = f_votes';
end

23 Feb 2012 Leo

Hi C.

Thanks for pointing that out. I believe you are correct, that line should be commented out.

23 Feb 2012 c.

Hi,
why do you call
TDindx = unique(TDindx);
when creating the forest?
I was under the impression that the use of bagging would improve the generalization abilities of the model, but through the call of unique, we are getting rid of all multiple instances. Why did you chose to not use bagging, but rather use subsets of the original data?

15 Nov 2011 Ming

Impressive.
Hi all,
Some people above said that the package fails to compile on Windows. I found that this is probably because log2() in GBCC.cpp is not available in older MSVC math libraries. A feasible solution is to replace log2(x) with log((double)x)/log(2.0).
Thanks again for sharing the code.

15 Nov 2011 Ming  
03 Nov 2011 Afzan

Thank you.

02 Nov 2011 Leo

Hi Afzan,

It is in:

/Stochastic_Bosque/cartree/mx_files

It's a C++ file

Leo

02 Nov 2011 Afzan

Leo, where is the best_cut_node function (used in cartree, line 45)? I can't find it. Or is it a compatibility problem again?

14 Sep 2011 Leo

Hi Kourosh,

Sorry for the late reply. Unfortunately I don't have a Windows machine to try this out. I am surprised it won't compile under Windows though; on Ubuntu it compiles with gcc.

Anyway if you paste the errors here maybe I can help out a bit more.

24 Aug 2011 Kourosh Khoshelham

Hi Leo, thanks for sharing this code. I have difficulty mexing the cpp files. Do I need a special compiler? When I get to this line:
mex -O best_cut_node.cpp GBCR.cpp GBCP.cpp GBCC.cpp
I receive many errors in node_cuts.h, and it finally says: Compile of 'best_cut_node.cpp' failed.
I'm using R2007b on win32.
could you help?
thanks,
kourosh

16 Jun 2011 Leo

Hey,

Sorry, the update was pending approval. It should be OK now.

12 Jun 2011 AMB

Hi, the zip file that I can download now has the same creation date as the old one, requires all the changes I made in order to run, and performs the same as well. Please advise.

06 Jun 2011 Leo

Hey,

I have already updated the code. You can re-download it.

03 Jun 2011 AMB

Hi - Will you be posting the new code? - I would like to try it out on regression - Right now, when I use the old code on the classic Boston Housing data set, I get all NaNs. I would like to see if this problem disappears in the code you fixed.

03 Jun 2011 Leo

Hi Shujjat,

You would have to show me exactly what line 99 is in your code. In my code I do not get any errors. I suspect you have altered the code and inadvertently added or omitted a parenthesis.

Leo

03 Jun 2011 Shujjat

Hi Leo,
I am following your and Mohammad's threads, and I am getting the following errors after making the amendments you indicated.
>> Random_Forest = Stochastic_Bosque(data,label);
??? Error: File: cartree.m Line: 99 Column: 27
Expression or statement is incorrect--possibly unbalanced (, {, or [.

Error in ==> Stochastic_Bosque at 45
Random_ForestT = cartree(Data(TDindx,:),Labels(TDindx), ...

Can you please help me?
Cheers

31 May 2011 Leo

Hey AMB,

Thanks a lot for all your help. I found the bug that was causing the difference in performance (accuracy-wise): a part of the code erroneously assumed that feature values were distinct.

It is now fixed, and results on the Glass dataset are equivalent to the results you quote for the google code.

Regarding speed, the code seems to run considerably faster on my PC, but nowhere near as fast as the google code, which is to be expected as the google code is written almost entirely in C/C++.

I have also removed the dependencies from the statistics toolbox using your suggestions (thanks!).

Dependence on internal.stats.getargs has also been removed

31 May 2011 AMB

Hi - I have followed your suggestion to compare the results of your code versus the "google code". This google code is at

http://code.google.com/p/randomforest-matlab/

This is a Matlab (and Standalone application) port for the excellent machine learning algorithm `Random Forests' - By Leo Breiman et al. from the R-source by Andy Liaw et al. http://cran.r-project.org/web/packages/randomForest/index.html ( Fortran original by Leo Breiman and Adele Cutler, R port by Andy Liaw and Matthew Wiener.) Current code version is based on 4.5-29 from source of randomForest package by Abhishek Jaiantilal.

Against the "glass" data set here are the statistics for 10 and 100 trees, withholding the 35% of the data as you had done.

For RandomBosque, the results were:

Elapsed time for 1000 runs: 1648.743 seconds
Average number correct with 35% samples held out: 0.636 for 10 trees 0.684 for 100 trees
Standard deviation correct with 35% samples held out: 0.076 for 10 trees 0.070 for 100 trees

For class_RFtrain and classRFpredict, the results were:

Elapsed time for 1000 runs: 88.021 seconds
Average number correct with 35% samples held out: 0.722 for 10 trees 0.758 for 100 trees
Standard deviation correct with 35% samples held out: 0.051 for 10 trees 0.048 for 100 trees

I use a MACAIR with MATLAB 2011a and OS 10.6.7.
I was surprised at the runtime differences and the differences in the statistics.

My calls to Randombosque look as follows:

tic;
correct = zeros(1000,2);
for i = 1:length(correct);

M = length(Labels);
m = round(.65*M);
intraining = randperm(M);
intraining = sort(intraining(1:m));
notintraining = setdiff([1:M],intraining);

Random_Forest = Stochastic_Bosque(Data(intraining,:),Labels(:,intraining),'ntrees',10);
[f_output f_votes] = eval_Stochastic_Bosque(Data(notintraining,:),Random_Forest);
error = Labels(:,notintraining)'-f_output;
correctlyclassified = numel(find(error == 0))/numel(error);
correct(i,1) = correctlyclassified;

Random_Forest = Stochastic_Bosque(Data(intraining,:),Labels(:,intraining),'ntrees',100);
[f_output f_votes] = eval_Stochastic_Bosque(Data(notintraining,:),Random_Forest);
error = Labels(:,notintraining)'-f_output;
correctlyclassified = numel(find(error == 0))/numel(error);
correct(i,2) = correctlyclassified;

if rem(i,25) == 1
fprintf('Iteration: %3.0f\n',i);
end
end
toc;

fprintf('Elapsed time for %3.0f runs: %5.3f seconds\n',length(correct),toc)
fprintf('Average number correct with 35%% samples held out: %5.3f for 10 trees %5.3f for 100 trees \n',mean(correct));
fprintf('Standard deviation correct with 35%% samples held out: %5.3f for 10 trees %5.3f for 100 trees\n',std(correct));

22 May 2011 Leo

Waleed Hi,

Unfortunately it is quite hard to figure out what the problem is without more specific feedback. On top of this, the getargs function is not my code so I am not that familiar with how it works (or how it can fail).

Perhaps what you could do is remove that line of code altogether and hard-code the parameters. For example, in the case of the cartree function, make it:

function RETree = cartree(Data,Labels)

and then replace the call to getargs by :

minparent=2;
minleaf=1;
m=size(Data,2);
method= 'c';
W= [];

Alternatively you could make the call to cartree :

function RETree = cartree(Data,Labels,minparent,minleaf,m,method,W)

remove the call to getargs and just make sure you pass values for all the parameters whenever you call cartree.

If you want to look into more fancy options for passing parameters, you might find this thread useful :

http://stackoverflow.com/questions/2775263/how-to-deal-with-name-value-pairs-of-function-arguments-in-matlab

22 May 2011 Leo

Hi AMB,

Unfortunately I don't have the google code installed to compare, but I ran comparisons with MATLAB's TreeBagger (glass data, 140/74 split, 10 trees) and got similar results for the two methods (my code seems to give better results, though I am not sure why).

22 May 2011 Leo

Hi AMB,

Thanks for all the feedback. You make some very good suggestions, which I will try to incorporate soon, especially concerning the randsample dependency (which hadn't crossed my mind).

For the datasets you are testing on: I tested on Glass with 10 trees and got ~72% accuracy on a 140/74 split. Could you report which splits and numbers of trees you are using, and what accuracies you are getting with this code and the "google" code?

Thanks!

22 May 2011 AMB

I have been running my modified code and comparing the results with the version on

http://code.google.com/p/randomforest-matlab/

The results of the present package that I modified as above, against the google code do not agree well. I am using classical datasets such as glass (classification) and boston housing (regression), and the google code has a much higher degree of accuracy. I would be grateful if anyone could share their experience on using these classical data sets to see whether they see the same result in their implementations. The boston data set is at http://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html and the glass data set is at http://archive.ics.uci.edu/ml/datasets/Glass+Identification

22 May 2011 AMB

This package was extremely useful. I should say to all that I am just a new student in this field and my comments reflect my interest in learning more, having a toolbox that is accessible, and one that actually works without days of effort. I must say that I have managed to get the other RandomForest implementations (Google code etc...) up and running but only with considerable difficulty owing to mex compilation issues. I did not have this particular difficulty with this package and as a result was delighted.

This package could be improved if it were accompanied by a demonstration file and some instruction on how to build the package and link the paths, and if it eliminated the dependency on the randsample Statistics Toolbox routine, which some users do not have.

After modification of a few lines, the calls to randsample can be replaced, I believe. For instance the call in Random_Bosque:

TDindx = randsample(numel(Labels),n,true);

could I think be replaced with

TDindx = round(numel(Labels)*rand(n,1)+.5);
TDindx = unique(TDindx);

and the call in cartree

node_var = sort(randsample(M,m,0));

could be replaced with

node_var = randperm(M);
node_var = sort(node_var(1:m));

There may be limitations to using these substitutions when M is large, but I was very pleased with the speed of the entire package.

The author's suggestions to replace the internal.stats.getargs with calls to getargs were entirely successful.

On a MAC, the cpp programs mex'd without difficulty. I found it expedient to simply move the mx_eval_cartree.mexmaci64 and the best_cut_node.mexmaci64 and the weighted_hist.m files to the folder containing Stochastic_Bosque.m than to adjust paths.

I used the irisdata as a demonstration. It is short and uncomplicated. It is available from http://en.wikipedia.org/wiki/Iris_flower_data_set

Just copy the data out and place it into an mfile. I put the data into a matrix called Data. To try out the Stochastic Bosque routines, I then wrote

Labels = Data(:,5)';
Data = Data(:,1:4);

and then invoked the package by the calls:

Random_Forest = Stochastic_Bosque(Data,Labels,'ntrees',50);
[f_output f_votes]= eval_Stochastic_Bosque(Data,Random_Forest);
error = Labels'-f_output;
correctlyclassified = numel(find(error == 0))/numel(error)

As I am a beginner, and was operating without a license on the author's source code, I thought it useful to subsample the iris data set so that I would have a test set against which to examine the performance of the Random_Forest. While this was unnecessary from a theoretical standpoint, I thought it worthwhile as a check that my modifications to the source were not ruinous.

The resulting test code looks like

M = length(Labels);
m = round(.5*M);
intraining = randperm(M);
intraining = sort(intraining(1:m));
notintraining = setdiff([1:M],intraining);
Random_Forest = Stochastic_Bosque(Data(intraining,:),Labels(:,intraining),'ntrees',10);
[f_output f_votes] = eval_Stochastic_Bosque(Data(notintraining,:),Random_Forest);
error = Labels(:,notintraining)'-f_output;
correctlyclassified = numel(find(error == 0))/numel(error)

and I was pleased to see that the correctlyclassified measure compared favorably with the original.

I should also have liked to see some proximity measures and permutation importance measures present. I speculate that perhaps these were eliminated to produce a package that runs swiftly. At any rate, I shall try to make these myself, because it seems to me that I can write a wrapper that calls Stochastic_Bosque to make my own calculations. If the author would care to offer any further suggestions or caveats, I would like to hear them, because I think that his work is useful and can be extended.

21 May 2011 Waleed Yousef

Thanks, but what about the error message

21 May 2011 Leo

Hi,

The line 45 you refer to has to do with the subsampling of data samples, not the features. Each tree is trained on a different subset of the training data.

21 May 2011 Waleed Yousef

I received the same errors as Mohammad above, and corrected them as you advised. Now I receive this error:

Error in ==> getargs at 48
emsg = '';

??? Output argument "varargout{7}" (and maybe others) not assigned during call to "C:\MyDocuments\MATLAB\tmp\getargs.m>getargs".

20 May 2011 Waleed Yousef

So what about line 45 in Stochastic_Bosque, where you say: cartree(Data(TDindx,:), ...

Doesn't this mean that you enforce a subset of the features on the whole tree?

20 May 2011 Leo

Hi,

Random feature selection for the cartrees is done in line 74 :

node_var=sort(randsample(M,m,0));

which is inside the tree construction loop. So it is done separately for each node. Is this what you were referring to or was it another line of code?

20 May 2011 Waleed Yousef

Leo, I just skimmed your code. I think you do random selection of features once per tree, not for each node in the tree as it should be. Am I right?

09 May 2011 Leo

Hi Mohammad,

It seems to be another version incompatibility.

Replace :

[unique_labels,~,Labels]= unique(Labels);

with

[unique_labels,dummy,Labels]= unique(Labels);

and it should work.

Leo

09 May 2011 Mohammad Ali Bagheri

Thanks Leo

I got another error using your codes:

??? Error: File: cartree.m Line: 50 Column: 25
Expression or statement is incorrect--possibly unbalanced (, {, or [.

Error in ==> Stochastic_Bosque at 46
Random_ForestT = cartree(Data(TDindx,:),Labels(TDindx), ...

The 50th line is:
[unique_labels,~,Labels]= unique(Labels);
It seems odd, at least to me.

Besides, I want to know: is your code based on the random subspace method? If so, what percentage of features is used to create the feature subsets?

09 May 2011 Leo

The default number of features sampled at each node is

round(sqrt(size(Data,2)))

where size(Data,2) is the dimensionality of the data.

You can set this via the 'nvartosample' parameter.
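For example, with 16-dimensional data the default works out to 4 features per node. A sketch of overriding it (the parameter name 'nvartosample' is taken from Leo's answer; the value 8 is just an illustration):

```matlab
m = round(sqrt(size(Data,2)));  % default: 16 features -> 4 sampled per node
% To sample more features at each node instead:
Random_Forest = Stochastic_Bosque(Data, Labels, 'nvartosample', 8);
```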


03 May 2011 Leo

Hi Mohammed,

internal.stats.getargs is an internal MATLAB function which I assume is not available in your version. You can download the following:

http://www.mathworks.com/matlabcentral/fileexchange/24082-getargs-m

and simply replace that line by :

[eid,emsg,minparent,minleaf,m,nTrees,n,method,oobe,W] = getargs(okargs,defaults,varargin{:});

(and similarly in the cartree function :

[eid,emsg,minparent,minleaf,m,method,W] = getargs(okargs,defaults,varargin{:});

)

28 Apr 2011 Mohammad Ali Bagheri

Hi Leo

When I run this command:
Random_Forest = Stochastic_Bosque(Patterns,Targets);

I get this error:

??? Undefined variable "internal" or class "internal.stats.getargs".

Error in ==> Stochastic_Bosque at 39
[eid,emsg,minparent,minleaf,m,nTrees,n,method,oobe,W] =
internal.stats.getargs(okargs,defaults,varargin{:});

Why?!!

Updates
14 Jun 2011

Removed the implicit assumption of distinct feature values; removed the Statistics Toolbox dependency; removed the internal-command dependency.

11 Apr 2012

Fixed bug (see comment by c.)
