split training data and testing data

Hello, I have a 54000 x 10 matrix. I want to split it into 70% training and 30% testing. What's the easiest way to do that?
  1 Comment
Delvan Mjomba on 6 Jun 2019
Use the randperm command to ensure random splitting. It's very easy.
For example:
If you have 150 items to split for training and testing, proceed as below (here data stands for the name of your data matrix):
indices = randperm(150);
trainingSet = data(indices(1:105), :);
testingSet = data(indices(106:end), :);
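
The same idea can be applied to the 54000 x 10 matrix from the question without hard-coding the split point. This is only a minimal sketch: the variable names (data, nTrain, idx) and the rand call used as stand-in data are assumptions for illustration, not part of the original comment.

```matlab
% Stand-in for the real 54000 x 10 matrix (assumption: replace with your data)
data = rand(54000, 10);

n      = size(data, 1);     % number of rows (observations)
nTrain = round(0.7 * n);    % 70% of the rows go to training

idx = randperm(n);          % random permutation of 1..n

% First nTrain shuffled rows for training, the remainder for testing
trainingSet = data(idx(1:nTrain), :);
testingSet  = data(idx(nTrain+1:end), :);
```

Computing nTrain from size(data,1) means the same lines should keep working if the number of rows changes.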


Accepted Answer

Akira Agata on 18 Jan 2018
Edited: the cyclist on 16 Aug 2022
I would recommend using cvpartition, like:
% Sample data (54000 x 10)
data = rand(54000,10);
% Cross-validation partition (train: 70%, test: 30%)
cv = cvpartition(size(data,1),'HoldOut',0.3);
idx = cv.test;
% Separate to training and test data
dataTrain = data(~idx,:);
dataTest = data(idx,:);
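
If the same split is needed on every run, the random seed can be fixed before creating the partition. The sketch below is a variation on the answer above, not a replacement for it: the seed value 42 is arbitrary, rand is stand-in data, and it assumes the Statistics and Machine Learning Toolbox (which provides cvpartition) is available.

```matlab
rng(42);                      % fix the random seed (42 is an arbitrary choice)

data = rand(54000, 10);       % stand-in data (assumption)

% Hold out roughly 30% of the rows for testing
cv = cvpartition(size(data, 1), 'HoldOut', 0.3);

% training(cv) and test(cv) return logical row indices
dataTrain = data(training(cv), :);
dataTest  = data(test(cv), :);

% Report how many rows ended up in each set
fprintf('Training rows: %d, test rows: %d\n', cv.TrainSize, cv.TestSize);
```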
  9 Comments
Shehbaz Aslam on 4 Sep 2021
I have 600001*4 data in Excel. When I use this, the data splits into 70% training and 30% testing, but the values in each column change after applying this function. For example, my third column contains values of 40, but in the generated training and testing data the values are changed: instead of 40 they become 0.2, 0.3 or 0.4. Why are these values changed? Please help. I simply want to divide the 600001*4 data into training and testing data so that I can train and test an ANFIS controller. Thanks.
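
One possible explanation, which is only a guess because the code that was actually run is not shown: if the sample line data = rand(54000,10) was kept, data holds random numbers between 0 and 1 instead of the spreadsheet contents. Indexing with the partition only selects rows and does not alter values, as the small sketch below tries to show. The matrix here is made-up stand-in data, and the file name in the comment is hypothetical.

```matlab
% Stand-in for the spreadsheet contents (assumption). In practice the data
% could be read with something like data = readmatrix('myfile.xlsx'),
% where the file name is hypothetical.
data = [(1:10)', 2*(1:10)', 40*ones(10, 1), rand(10, 1)];

cv  = cvpartition(size(data, 1), 'HoldOut', 0.3);
idx = test(cv);               % logical index of the test rows

dataTrain = data(~idx, :);
dataTest  = data(idx, :);

% Every row of each subset is an unchanged row of the original matrix
allRowsKept = all(ismember(dataTrain, data, 'rows')) && ...
              all(ismember(dataTest,  data, 'rows'));

% The third column still holds 40 in both subsets
thirdColumnUnchanged = all(dataTrain(:, 3) == 40) && all(dataTest(:, 3) == 40);
```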


More Answers (3)

Vrushal Shah on 14 Mar 2019
If we want to split the data set into training and testing phases, what is the best option to do that?

Gilbert Temgoua on 19 Apr 2022
Edited: Gilbert Temgoua on 20 Apr 2022
I find dividerand very straightforward, see below:
% randomly select indexes to split data into 70%
% training set, 0% validation set and 30% test set.
[train_idx, ~, test_idx] = dividerand(54000, 0.7, 0, 0.3);
% slice training data with train indexes
%(take training indexes in all 10 features)
x_train = x(train_idx, :);
% select test data
x_test = x(test_idx, :);
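
For completeness, here is a self-contained version of the same approach that also checks the two index sets do not overlap. It is a sketch under assumptions: x is filled with rand as stand-in data, the check variables (noOverlap, allUsed) are names made up for this example, and dividerand requires the Deep Learning Toolbox.

```matlab
x = rand(54000, 10);          % stand-in for the real data (assumption)

% 70% training, 0% validation, 30% test
[train_idx, val_idx, test_idx] = dividerand(size(x, 1), 0.7, 0, 0.3);

x_train = x(train_idx, :);
x_test  = x(test_idx, :);

% Sanity checks: no row is in both sets, and every row is assigned once
noOverlap = isempty(intersect(train_idx, test_idx));
allUsed   = numel(train_idx) + numel(val_idx) + numel(test_idx) == size(x, 1);
```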
  1 Comment
uma on 28 Apr 2022
How can I split the data into trainx, trainy, testx, testy format, so that trainx and trainy have the same first dimension and testx and testy also have the same first dimension? For example, I have a 1000*9 dataset: trainx should be 1000*9, trainy should be 1000*1, testx should be 473*9 and testy should be 473*1.
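
One way to keep features and labels aligned is to index both with the same row indices. The sketch below assumes the labels live in a separate vector y with one entry per row of X, and it uses the 70/30 ratio from the original question rather than the sizes quoted in this comment. X, y and all other names are hypothetical.

```matlab
X = rand(1000, 9);            % hypothetical feature matrix
y = randi([0 1], 1000, 1);    % hypothetical label vector, one label per row of X

n      = size(X, 1);
idx    = randperm(n);         % shuffle the row numbers once
nTrain = round(0.7 * n);

trainIdx = idx(1:nTrain);
testIdx  = idx(nTrain+1:end);

% Use the same indices for features and labels so the rows stay paired
trainX = X(trainIdx, :);   trainY = y(trainIdx, :);
testX  = X(testIdx, :);    testY  = y(testIdx, :);
```

Because trainX and trainY are built from the same trainIdx, their first dimensions match by construction, and the same holds for testX and testY.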



Jere Thayo on 28 Oct 2022
What if both the training and testing data are already in files, i.e. X_train.mat, y_train.mat, x_test.mat and y_test.mat?
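
In that case no further splitting should be needed, and the files can simply be loaded. The sketch below writes two small demo files to a temporary folder first so that it runs on its own. The contents of those files, the folder, and the assumption that each file stores a single variable are all made up for illustration. The test files would be loaded in the same way.

```matlab
% Create small demo files (assumption: in practice the files already exist)
folder  = tempname;
mkdir(folder);
X_train = rand(70, 9);
y_train = rand(70, 1);
save(fullfile(folder, 'X_train.mat'), 'X_train');
save(fullfile(folder, 'y_train.mat'), 'y_train');

% load returns a struct whose fields are the variables stored in the file
S      = load(fullfile(folder, 'X_train.mat'));
names  = fieldnames(S);
trainX = S.(names{1});        % take the first (here: only) variable

S      = load(fullfile(folder, 'y_train.mat'));
names  = fieldnames(S);
trainY = S.(names{1});

% Features and labels should have the same number of rows
sameRows = size(trainX, 1) == size(trainY, 1);
```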
