I am creating a hyper-parameter-optimized decision tree:
part = cvpartition(features.label, "KFold", 10);
opt = struct("CVPartition", part);
mytree = fitctree(features, 'label', 'MaxNumSplits', 10, 'OptimizeHyperparameters', 'SplitCriterion', 'HyperparameterOptimizationOptions', opt);
As I understand it, this creates 10 folds: for each fold, 90% of the data is used to train a decision tree, which is then evaluated on the remaining 10%. I have two questions:
Question 1: How is this 90/10 split created? Are the held-out rows taken sequentially from the feature table, or are they chosen at random?
Question 2: How are the 10 decision trees combined/merged to create the final decision tree?
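For reference, here is a minimal sketch of how the fold membership could be inspected directly. It uses a small synthetic table as a stand-in for my real features table (the columns x1 and x2 and the random data are made up for illustration). It only looks at the cvpartition object and does not touch fitctree:

```matlab
% Synthetic stand-in for the real feature table (hypothetical data).
rng(0);                                   % fix the seed so the folds are repeatable
n = 100;
features = table(randn(n, 1), randn(n, 1), categorical(randi(2, n, 1)), ...
    'VariableNames', {'x1', 'x2', 'label'});

part = cvpartition(features.label, "KFold", 10);

disp(part.TrainSize);   % number of training rows in each fold
disp(part.TestSize);    % number of held-out rows in each fold

% Row indices held out in the first fold
testRows = find(test(part, 1));
disp(testRows');

% Count how often each row is held out across all folds
timesHeldOut = zeros(n, 1);
for k = 1:part.NumTestSets
    timesHeldOut = timesHeldOut + test(part, k);
end
disp(all(timesHeldOut == 1));
```

If the indices printed for the first fold are not consecutive, the split is presumably not sequential. That still would not tell me how the 10 trees relate to the final one.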