Thread Subject:
Splitting Criterion in Decision Forests for Matlab

Subject: Splitting Criterion in Decision Forests for Matlab

From: Jay

Date: 1 Aug, 2013 12:00:14

Message: 1 of 3

Hi!

There is a thorough description of decision forests in this report
http://research.microsoft.com/apps/pubs/default.aspx?id=155552
and I am looking to use a framework like this in MATLAB.

MATLAB's TreeBagger class in combination with classregtree seems the way to go, but I have two problems with this solution:
1. The split criterion (called the weak learner model in the above report) seems to be fixed: classregtree always aims to minimize a least-squares error in every node. It would be nice if I were able to write the weak learner model myself and decide how to split.

2. What exactly is the predictor model used in classregtree, i.e., what is the function a leaf node uses to make a prediction, and how does TreeBagger combine the different outputs of its individual trees? It would also be nice if I could change the predictor model used.


So, is there a way to do what I want in MATLAB (either with TreeBagger or with another implementation)? I would highly appreciate not having to modify any C++ files or the like :)

Subject: Splitting Criterion in Decision Forests for Matlab

From: Alan_Weiss

Date: 1 Aug, 2013 12:44:10

Message: 2 of 3

On 8/1/2013 8:00 AM, Jay wrote:
> [snip]

In fact, as of R2011a, Statistics Toolbox offers an ensemble learning
framework for both boosting and bagging that in most respects supersedes
classregtree and TreeBagger. In particular, it allows you to choose the
split criterion:
http://www.mathworks.com/help/stats/classificationtreeclass.html
See the 'SplitCriterion' name-value pair, and the Impurity and Node
Error definition.
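For instance, a sketch of choosing the split criterion when growing a single classification tree (this assumes Fisher's iris data from the toolbox and the R2011a-era ClassificationTree.fit syntax; 'deviance' is one of the documented criteria, alongside 'gdi' and 'twoing'):

```matlab
% Grow a classification tree using cross-entropy ('deviance') instead of
% the default Gini diversity index ('gdi') as the split criterion.
load fisheriris                      % meas (150x4), species (labels)
tree = ClassificationTree.fit(meas, species, ...
    'SplitCriterion', 'deviance');
view(tree)                           % inspect the resulting splits
```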

The definition of prediction in this new framework is here:
http://www.mathworks.com/help/stats/compactclassificationtree.predict.html
http://www.mathworks.com/help/stats/compactclassificationensemble.predict.html
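A minimal sketch of the bagged-ensemble route with fitensemble, again assuming the iris data (the choice of 50 trees is arbitrary):

```matlab
% Bagged ensemble of classification trees; predict then aggregates
% the per-tree outputs as documented in the links above.
load fisheriris
ens = fitensemble(meas, species, 'Bag', 50, 'Tree', ...
    'Type', 'classification');
label = predict(ens, meas(1,:));     % predicted class for one observation
```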

Alan Weiss
MATLAB mathematical toolbox documentation

Subject: Splitting Criterion in Decision Forests for Matlab

From: Ilya Narsky

Date: 1 Aug, 2013 13:55:14

Message: 3 of 3

"Jay " <aro-_-mail@web.de> wrote in message
news:ktdike$j23$1@newscl01ah.mathworks.com...
> [snip]

It's unlikely anyone in this forum will want to read the ~140-page document
you are referring to. Without reading it, I'd like to clarify a few things.

"Weak learner" in the ensemble literature refers not to the split criterion
imposed by a decision tree, but to any data-fitting model used repeatedly
for growing an ensemble of such models. The fitensemble function provides
several ensemble-learning algorithms. Some of them use decision trees as
the weak learner, and others use k-NN or discriminant analysis.
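As a sketch of a non-tree weak learner, fitensemble's Subspace method accepts 'KNN' (or 'Discriminant') as the learner; the iris data and 30 learners here are arbitrary assumptions:

```matlab
% Random-subspace ensemble whose weak learners are k-NN classifiers
% rather than decision trees.
load fisheriris
ens = fitensemble(meas, species, 'Subspace', 30, 'KNN');
```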

The classregtree class, as well as the decision tree classes introduced
later (see Alan's response), supports several split criteria for
classification and one (MSE) for regression. Even if you focus on
regression, the decision tree is not fixed. One important (perhaps the
most important) tuning knob in ensemble learning with decision trees is
the tree size. You can control it by passing the 'minleaf' and
'minparent' parameters to TreeBagger or to fitensemble (via a learner
template).
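A sketch of both routes (the iris data and the specific 'minleaf' value of 5 are arbitrary assumptions for illustration):

```matlab
load fisheriris

% Route 1: TreeBagger with a minimum leaf size per tree.
b = TreeBagger(50, meas, species, 'MinLeaf', 5);

% Route 2: fitensemble with a learner template controlling tree size.
t   = ClassificationTree.template('MinLeaf', 5);
ens = fitensemble(meas, species, 'Bag', 100, t, ...
    'Type', 'classification');
```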

At present, we do not support arbitrary weak learners for ensembles out of
the box. But if you want to code a weak learner model yourself, feel free to
get in touch with me, and I'll see what can be arranged.

Ilya

Tags for this Thread

Tag Activity for This Thread
Tag Applied By Date/Time
machine learning Jay 1 Aug, 2013 12:04:12
random forest Jay 1 Aug, 2013 12:04:12
treebagger Jay 1 Aug, 2013 12:04:12
classregtree Jay 1 Aug, 2013 12:04:12