How to partition a dataset into 60% vs. 40%

sam on 9 Nov 2014
Commented: the cyclist on 13 Feb 2020
I have downloaded a diabaetes.mat file. I want to partition the data set into two groups: a 60% training set and a 40% test set. I then want to rank the features.
I plan to rank the features with the corrcoef function, but I have no idea how to partition the data set into 60% vs. 40%.
Cheers, Sam

Answers (1)

the cyclist on 9 Nov 2014
Depending on what toolboxes you have installed, there are a number of options:
cvpartition
randsample
randperm
The first two require the Statistics Toolbox, but that last one is in core MATLAB.
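As a minimal sketch of the randperm route for the 60/40 split asked about, assuming the observations are stored as rows of a numeric matrix: the matrix X, its size, and the variable names below are hypothetical placeholders, not the contents of the asker's file.

```matlab
% Hypothetical data: 100 observations (rows), 8 features (columns).
% Replace X with the matrix loaded from your own .mat file.
rng(0);                         % fix the seed so the split is reproducible
X = rand(100, 8);

n      = size(X, 1);            % number of observations
idx    = randperm(n);           % random ordering of the row indices
nTrain = round(0.6 * n);        % 60% of the rows go to training

Xtrain = X(idx(1:nTrain), :);       % first 60% of the shuffled rows
Xtest  = X(idx(nTrain+1:end), :);   % remaining 40%
```

If the labels are stored in a separate vector, index it with the same idx so that rows and labels stay aligned.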
  2 Comments
Taiwo Kupoluyi on 13 Feb 2020
Edited: Taiwo Kupoluyi on 13 Feb 2020
I guess the question to ask (for someone new to MATLAB) is: what toolbox do I need in order to partition a dataset into training and holdout data?
Thank you in anticipation of your response.
the cyclist on 13 Feb 2020
You don't need any toolbox to partition a dataset. You can partition a dataset into training and holdout sets using the randperm function (in base MATLAB) to randomly order the data, and then pick the first 80% (for example) for training.
But the cvpartition and randsample functions might make the job a little easier. Also, the Statistics and Machine Learning Toolbox is likely to have many other functions you might want to use for modeling.
This is kind of a general rule for toolboxes. You could write everything from scratch if you want to. Getting a toolbox is paying for the convenience (and rigor) of having MathWorks do it.
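For comparison, the cvpartition route might look roughly like the sketch below. It assumes the Statistics and Machine Learning Toolbox is installed; the matrix X and the 40% holdout fraction are illustrative choices, not taken from anyone's actual data.

```matlab
% Requires the Statistics and Machine Learning Toolbox.
% Hypothetical data: 100 observations (rows), 8 features (columns).
rng(0);                         % fix the seed so the split is reproducible
X = rand(100, 8);

% Hold out 40% of the observations for testing, leaving 60% for training.
c = cvpartition(size(X, 1), 'HoldOut', 0.4);

Xtrain = X(training(c), :);     % training(c) is a logical index vector
Xtest  = X(test(c), :);         % test(c) marks the held-out rows
```

Passing a vector of class labels instead of the observation count makes cvpartition stratify the split by class, which can be useful when the classes are unbalanced.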
