Use a function that returns multiple values as input argument to another function

Hi,
I have a function CutOffForPctile, given below, in which the input argument "InData" is a column of a dataset/table and Pctile is a scalar value between 0 and 100. It is used to calculate the mean and standard deviation of the data points up to a specific percentile.
function [CutOffMean, CutOffStd] = CutOffForPctile(InData, Pctile)
CutOff = prctile(InData,Pctile);
CutOffMean = mean(InData(InData<=CutOff));
CutOffStd = std(InData(InData<=CutOff));
end
I pass the above function as an input to grpstats, as shown below. Here C is a dataset, and I'm trying to find the mean and std of the data points up to the 75th percentile within each defined group.
Cstats_Mod = grpstats(C,{'OptDesPt1', 'OptDesPt2'}...
,{'min','mean','max', @(C)CutOffForPctile(C,75),},'DataVars','DEffBasedOnMtlbFun');
My issue is that Cstats_Mod contains only one set of values (the mean) from CutOffForPctile; the standard deviation is missing from the output.
How can I get both the mean and std back when the function is passed as an input argument to grpstats? Note that I don't want to write two separate functions: I have a fairly big data set and finding the percentile cutoff is a costly process, so I would like to compute it once and use it for both the mean and the sd.
Any ideas?
Thanks Hari

Accepted Answer

dpb
dpb on 25 Feb 2015
Edited: dpb on 25 Feb 2015
For reference, when a Matlab function call is used inline as an argument to another function, only its first output variable is passed on; there is simply no syntax for requesting the other outputs in that position.
Calling the function and saving the outputs in variables first, before calling grpstats, is the only option.
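To illustrate the first-output rule, here is a minimal sketch using the built-in max, which returns both a value and an index. The variable names (v, maxVal, maxIdx, nested, both) are made up for the illustration and are not from the question.

```matlab
v = [3 9 4];

% Called as a statement, max can return both the value and its index.
[maxVal, maxIdx] = max(v);

% Called inline as an argument, only the first output is passed on;
% the index is silently discarded.
nested = num2str(max(v));

% To use both outputs downstream, capture them first and pass them explicitly.
both = sprintf('%d at position %d', maxVal, maxIdx);
```

The same thing happens with @(C)CutOffForPctile(C,75): grpstats only ever sees CutOffMean.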
BTW, in your function, I'd compute a logical vector and use it instead of doing the test twice...
function [CutOffMean, CutOffStd] = CutOffForPctile(InData, Pctile)
CutOff = prctile(InData,Pctile);
idx=InData<=CutOff;
CutOffMean = mean(InData(idx));
CutOffStd = std(InData(idx));
end
Not sure if the JIT optimizer can find the common expression or not...
ADDENDUM
OK, I admit my lack of familiarity with grpstats led me to not fully consider the use of the function handle; I was thinking one could simply substitute a set of values; clearly that isn't so, agreed...
Checking the documentation for grpstats, a function passed by handle to compute additional statistics must return either a column vector or an nvals-by-ncols array. So, I think you need to rewrite your function as
function [CutOffStats] = CutOffForPctile(InData, Pctile)
CutOff = prctile(InData,Pctile);
idx=InData<=CutOff;
% Single output: column vector [mean; std] (name must match the declared output exactly)
CutOffStats = [mean(InData(idx));
std(InData(idx))];
end
You'll need to ensure proper orientation of InData, of course, so that the variables are by column and each statistic occupies one row of the output.
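As a rough sketch of how a single-output statistic of this kind might be used, the snippet below builds the same [mean; std] column vector from two anonymous functions, so it runs as a plain script, and passes it to grpstats. It assumes the Statistics Toolbox and a release with tables; the toy data (g, x, T) and the helper names meanStd and cutStats are made up for illustration. I have not checked exactly how grpstats names or lays out the resulting variable, so inspect S on your own release.

```matlab
% Toy data: two groups, each with one large value above the 75th percentile.
g = [1; 1; 1; 1; 2; 2; 2; 2];
x = [1; 2; 3; 100; 10; 20; 30; 400];
T = table(g, x);

% Return mean and std together as one 2-by-1 column vector.
meanStd = @(w) [mean(w); std(w)];

% prctile is evaluated once per call; the trimmed data feed both statistics.
cutStats = @(v, p) meanStd(v(v <= prctile(v, p)));

% One function handle now delivers both statistics for every group.
S = grpstats(T, 'g', {'mean', @(v) cutStats(v, 75)}, 'DataVars', 'x');
```

With a real data set, the same handle can go straight into the original call in place of @(C)CutOffForPctile(C,75), and no per-group precomputation is needed.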
  5 Comments
Hari
Hari on 27 Feb 2015
Thank you for the detailed explanation. Very helpful.
I see your point regarding documentation. SAS was my best friend for more than a decade when I was working in the professional world, and now, after coming to academics for further studies, I have to use Matlab. During the early years of learning SAS I had a lot of enthusiasm for learning the nooks and corners, but over the years my attention shifted away from tools/programming to the analysis of the data itself (Stats/OR).
My take on "newer graduates" is that what were sub-fields or small areas at one point in time have now become stand-alone subjects, and with rapid advancements more and more of that will happen in years to come; so a committed individual (who is ready to slog) ends up making choices on where/what to focus on, considering their own strengths, aspirations etc. There are so many interesting things to learn that I keep negotiating with myself not to get side-tracked; still, it is an ongoing struggle.
Btw, I never thought that after working with SAS I would ever like any other software for data analysis. With Matlab I have been pleasantly surprised, and this will definitely be a long-term friend.
dpb
dpb on 27 Feb 2015
Edited: dpb on 27 Feb 2015
Indeed, I agree w/ the comments re: the field explosion; it's real, for sure. The key point I was trying to make, in response to your comment on being surprised that I gleaned the form needed, is that it's important to get to the end objective which is, I agree, the actual results of the analysis, not the code per se; in the end Matlab (or SAS or whatever) is just a glorified pencil/calculator. In order to do that expeditiously, one needs to learn to use all the facilities that are provided, and these details in the documentation are some of the most important not to overlook (almost every question asked here has an answer obtainable by such study, if the poster would really read the documentation).(*)
Sounds like we may have had a lot in common; I started in the reactor engineering area; discovered statistics in the process of working with incore instrumentation systems and evolved into consulting with an emphasis towards utilizing probabilistic tools for engineering problems not amenable otherwise.
SAS was also a constant companion for years; you're fortunate to be coming to Matlab at the present time; until quite recently the features for such were quite weak in comparison. There are still issues with the integration of things into a comprehensive package but it's much improved, indeed. Matlab is somewhat more flexible for exploratory computational work; SAS still has some advantages for packaged standard data analyses in my view as it was, from the git-go, intended for and implemented as such, whereas TMW has had to try to graft that on top of the general programming matrix language and try to keep the open nature as well. It's a tough mix to do well...
(*) Another sidebar-- :) I've complained over the years to TMW that, particularly for the base language, the documentation, while extensive, is lacking in that there is no definitive definition of the details of syntax; it is all written as narrative/by example instead. This does lead to areas in which there are "holes" or ambiguities or just oversights. I believe the idea was always to try to make it more accessible and therefore "easy" as compared to standard languages such as its initial pattern, Fortran, and I understand the intent from that standpoint. But I also think it has evolved over the years to be somewhat deliberate, to retain flexibility and to keep what is deemed proprietary knowledge from public release.


More Answers (1)

Hari
Hari on 25 Feb 2015
Hi, can you kindly provide an example of what you are proposing? I'm not able to make it work, as I need to find the mean/std for specific levels of the grouping variables, as indicated in my grpstats code.
I have almost 5000 unique grouping levels in my data set and I am not sure whether I am required to do the saving process that you recommended for each of those levels in advance.
PS: thanks for the tip on index, idx.
Thanks Hari
