Splitting a table using varargin

I have a table named Data. I want to split it into three different tables based on the values in the first column. I implemented the following lines of code, which work perfectly. However, I don't understand what varargin is doing. The MATLAB documentation states that the first term in the bracket after splitapply should be a function such as @max. Is varargin acting as some sort of function? Secondly, why is it written twice (@varargin and the varargin again).
Group = findgroups(Data{:, 1});
Split = splitapply( @(varargin) varargin, Data , Group);

3 Comments

This is really clever. Mind sharing where you saw this idea? I'll add an explanation below as an answer.
Doron Joffe on 5 Dec 2021
Edited: Doron Joffe on 5 Dec 2021
Thank you very much. I would appreciate any help on this code. I saw the code at this link.
https://www.mathworks.com/matlabcentral/answers/440184-is-it-possible-to-split-a-table-into-multiple-tables-based-in-id-a-number-code-in-colum-a#answer_356786


 Accepted Answer

Stephen23 on 5 Dec 2021
Edited: Stephen23 on 5 Dec 2021
"The MATLAB documentation states that the first term in the bracket after splitapply should be a function such as @max"
The SPLITAPPLY documentation actually states that the first input must be a function handle. It describes that input as func, the "Function to apply to groups of data, specified as a function handle."
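As a small illustration (not taken from the documentation), a handle to a named function and an anonymous function are both values of class function_handle, which is why either can be passed as the first input of SPLITAPPLY. The variable names below are made up for this sketch:

```matlab
% A handle to an existing named function.
named = @max;
% An anonymous function defined inline.
anon = @(x) max(x);

% Both are values of class function_handle.
isNamedHandle = isa(named, 'function_handle');
isAnonHandle  = isa(anon,  'function_handle');

% Calling through either handle gives the same result for this input.
a = named([1 5 3]);   % 5
b = anon([1 5 3]);    % 5
```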
Function handles are one of the fundamental data classes in MATLAB.
"Is varargin acting as some sort of function?"
No, VARARGIN is a special input argument in a function definition (aka signature) which collects any number of inputs into one cell array named varargin.
"Secondly, why is it written twice (@varargin and the varargin again)."
Because this code:
@(varargin)varargin
defines an anonymous function which uses VARARGIN to accept any number of inputs and stores them all in a cell array. Whilst it is certainly possible to write functions which totally ignore their inputs, if you want to use the inputs in some way then they need to be referred to within the function itself. So VARARGIN occurs once as the input to the anonymous function and once where it is used (which in this example is simply the cell array itself, rather than the more common cases of passing the data to another function call or something similar).
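To make the two occurrences concrete, here is a minimal sketch with two anonymous functions. The names passThrough and countInputs are hypothetical and chosen only for this example. The first returns the collected cell array unchanged, as in the question; the second passes it on to another function call:

```matlab
% Returns the collected inputs unchanged (the form used in the question).
passThrough = @(varargin) varargin;
% Passes the collected inputs on to another function call.
countInputs = @(varargin) numel(varargin);

c  = passThrough(1, 'a');        % 1x2 cell array holding 1 and 'a'
n0 = countInputs();              % 0, because varargin is an empty cell array
n3 = countInputs(1, 'a', []);    % 3
```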
Why it works is explained in the source answer linked in the comments on the question.

4 Comments

Thank you very much for the detailed answer. So in this example, would the inputs to varargin be the numbers 3, 2, 1 etc. from the findgroups function? These numbers are then fed to the anonymous function @(varargin)varargin, which creates a cell array. Is this the correct overview of what takes place?
@doron joffe: the grouping values are used by SPLITAPPLY to determine which values to call the function handle with. The function handle (in this case an anonymous function) is called by SPLITAPPLY with inputs (i.e. data from your table) that are selected based on the grouping values, but it does not receive the grouping values themselves.
fnh = @(varargin)varargin;
out = fnh(1,2,pi,'cat')
out = 1×4 cell array
{[1]} {[2]} {[3.1416]} {'cat'}
> would the inputs to varargin be the numbers 3,2,1 etc from the findgroup function
No. In splitapply(@(varargin)varargin,Data,Group), Data are the inputs (see my answer). The grouping variable is applied within splitapply, not within the anonymous function.
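A rough way to check this is to have the anonymous function report what it receives. The sketch below uses a small made-up table (the variable names ID and Value are assumptions, not from the question):

```matlab
% Small example table: two groups (ID 1 and ID 2), two variables.
Data  = table([1;1;2;2;2], [10;20;30;40;50], 'VariableNames', {'ID','Value'});
Group = findgroups(Data{:, 1});

% Number of inputs per call: one input per table variable.
nInputs = splitapply(@(varargin) numel(varargin), Data, Group);

% Number of rows in the first input per call: the rows of that group.
nRows = splitapply(@(varargin) numel(varargin{1}), Data, Group);
```

Here nInputs should be [2;2] (two table variables for each group) and nRows should be [2;3] (the group sizes), so the function is handed table data for one group at a time and never sees Group itself.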
Thank you for the excellent explanation.


More Answers (1)

The function passed to splitapply can also be an anonymous function of the form @(vars)func(vars), where vars is one or more variables and func is a function. In this case, however, the anonymous function has the form @(x)x, which means that it merely returns its input. varargin collects all of the inputs into a cell array.
Examples:
fun = @(x)x;
fun(pi)
ans = 3.1416
fun2 = @(varargin)varargin;
fun2(1,2,'S')
ans = 1×3 cell array
{[1]} {[2]} {'S'}
splitapply splits the variables of a table into groups and applies the specified function to each group. Each variable (column) of the table is treated as a separate input, so the function passed to splitapply must accept as many inputs as there are table variables. varargin is a flexible way to accept all table variables without needing to know the table width.
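The difference can be sketched with a made-up three-variable table (the names ID, A and B are assumptions for this example). A function with a fixed two-input signature cannot be called with three table variables, whereas the varargin version accepts any width:

```matlab
% Example table with three variables and two groups.
Data3 = table([1;1;2], [10;20;30], [0.1;0.2;0.3], 'VariableNames', {'ID','A','B'});
Group = findgroups(Data3.ID);

% A fixed two-input signature is called with three inputs and errors.
fixedFailed = false;
try
    splitapply(@(id, a) numel(id), Data3, Group);
catch
    fixedFailed = true;
end

% varargin accepts however many variables the table has.
nInputs = splitapply(@(varargin) numel(varargin), Data3, Group);
```

With this table, fixedFailed ends up true and nInputs is [3;3].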
The output for an n-by-3 table with 2 different groups would be a 2-by-3 cell array C, where C{i,j} contains the values in table column j that correspond to group i.
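Since the original goal was one table per group, one possible variation (not the code from the question) is to build each group's table inside the anonymous function and wrap it in a scalar cell. The table contents and the names makeTable and Tables are assumptions for this sketch:

```matlab
% Small example table: two groups (ID 1 and ID 2), two variables.
Data  = table([1;1;2;2;2], [10;20;30;40;50], 'VariableNames', {'ID','Value'});
Group = findgroups(Data{:, 1});
names = Data.Properties.VariableNames;

% Rebuild a table from the group's columns and wrap it in a scalar cell,
% so that splitapply can concatenate the per-group outputs vertically.
makeTable = @(varargin) {table(varargin{:}, 'VariableNames', names)};
Tables = splitapply(makeTable, Data, Group);

T1 = Tables{1};   % rows with ID 1
T2 = Tables{2};   % rows with ID 2
```

A plain loop over the group numbers with Data(Group == k, :) would give the same tables and may be easier to read.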

1 Comment

thank you very much. This explains it perfectly.
