MATLAB Answers

How to show Sample Size at Each Split in Tree using fitctree?

Justin on 29 Oct 2014
Commented: Justin on 31 Oct 2014
I am using fitctree, and of course altering the MinLeaf size changes the tree output drastically, but I am also interested in seeing how the sample size shrinks as the tree progresses.
Does anyone know how?
Thanks! Justin

Answers (1)

Siddharth Sundar on 31 Oct 2014
If I understand correctly, you want to be able to extract the subset of observations used at each split in a node.
The CutPredictor property of the ClassificationTree object is what you need.
tree.CutPredictor returns the name of the variable used to split at each node. Combine it with the CutPoint property, which gives the value used as the cut point at each node, and use the resulting conditions to index into the training data set; that yields the subset of observations at each split.
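As a rough illustration of the indexing described above, here is a minimal sketch for the root node only. It assumes the fisheriris example data set (not part of the original question), numeric predictors, and the usual rule that observations with a value below the cut point go to the left child. The variable names are made up for the example.

```matlab
% Sketch only: fisheriris stands in for the real training data.
load fisheriris                       % provides meas (150x4) and species
tree = fitctree(meas, species);       % predictors get default names x1..x4

rootVar = tree.CutPredictor{1};       % name of the predictor split at node 1
rootCut = tree.CutPoint(1);           % threshold used at node 1

% Map the predictor name back to a column of the training matrix.
col = find(strcmp(tree.PredictorNames, rootVar));

% Assumed rule for numeric predictors: value < cut point goes left.
goesLeft    = meas(:, col) < rootCut;
leftSubset  = meas(goesLeft, :);      % observations sent to the left child
rightSubset = meas(~goesLeft, :);     % observations sent to the right child

fprintf('Root split on %s at %g: %d left, %d right\n', ...
    rootVar, rootCut, size(leftSubset, 1), size(rightSubset, 1));
```

Repeating this for deeper nodes means applying the conditions of every ancestor node in turn, which gets tedious quickly if only the counts are needed.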
  1 Comment
Justin on 31 Oct 2014
I could have been clearer: what I want to know is the number of observations at each node in the tree.
That is, say we started with a sample size of 1000 observations. I would like to know whether the first node split that into 500 / 500 or 900 / 100, and so on for each node in the tree.
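One possible way to get these counts, offered as a sketch rather than an answer confirmed in this thread: the ClassificationTree object has a NodeSize property (number of training observations reaching each node) and a Children property (child node indices, with zeros for leaf nodes). The example below assumes the fisheriris data set in place of the real data; the variable names are illustrative.

```matlab
% Sketch only: fisheriris stands in for the real training data.
load fisheriris
tree = fitctree(meas, species);

kids  = tree.Children;    % n-by-2 matrix: [left right] child of each node, 0 for leaves
sizes = tree.NodeSize;    % n-by-1 vector: observations reaching each node

% Walk over the branch nodes (those that have children) and report the split.
branchNodes = find(kids(:, 1) > 0).';
for node = branchNodes
    fprintf('Node %d (%s, cut at %g): %d -> %d / %d\n', ...
        node, tree.CutPredictor{node}, tree.CutPoint(node), ...
        sizes(node), sizes(kids(node, 1)), sizes(kids(node, 2)));
end
```

Each printed line shows the parent's sample size followed by the left / right sizes, so rerunning this after changing MinLeaf shows how the sample shrinks along each path.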
