Analyzing the occurrences of strings within cells using loops

1 view (last 30 days)
Good afternoon Matlab forums!
I've been working on a code to sort an array of data that includes both numbers and string values. Basically, within column 1 I have identification #'s, in column 2, I have strings, and in column 3 I have charges to be summed.
I am attempting to extract a histogram which shows the occurrences of the strings in column 2 based on particular conditions of the set. Basically, for each unique column 1 ID#(There are multiple occurences of the same here), I wish to check if Column 3 adds up to a particular threshold, in this case being 10,000. Here is a sample of my current code.
Q = cellfun(@str2num,SUB_ID,'UniformOutput',false);
D = cell2mat(Q);
concentated = [D C];
unique_ids = unique(concentated(:,1));
n_ids = length(unique_ids);
for i = 1:n_ids;
A_id = concentated(concentated(:,1) == unique_ids(i),:);
sum_id = sum(A_id(:,2));
if sum_id >= 10000
disp(unique_ids(i))
disp(sum_id)
end
end
The beginning of this code is me creating a 2 column matrix with the data in the array, excluding the previous "Column 2" data, as I can't seem to figure out how to make these loops work within in array as opposed to a matrix. Any help would be appreciated!!!
Thanks, J
  7 Comments
Walter Roberson
Walter Roberson on 13 Oct 2015
Right click and Save Link As, and then open the downloaded file.
dpb
dpb on 13 Oct 2015
As noted in the other thread, that worked w/ the earlier release; it's not working this AM after the upgrade--tried several incantations/variations with no success, unfortunately.

Sign in to comment.

Accepted Answer

Guillaume
Guillaume on 13 Oct 2015
I'm using a table here instead of cell arrays but the principle is the same. It's not clear if your SUB_IDs are always numerical or can also contain other symbols but in any case, you don't need to convert it to number. That's an unnecessary slow step. accumarray, ismember and some simple indexing will avoid any loop:
db = readtable('MatBookSample.xlsx'); %or read as cell array
[ids, ~, idlocations] = unique(db.SUB_ID); %get all unique id values and their location
idcharge = accumarray(idlocations, db.CHRG_AMT); %calculate charge amount per id, in the same order as ids
idovers = find(idcharge > 10000); %index of ids whose charge amount is above threshold
servovers = db.SERV_PROV(ismember(idlocations, idovers)); %corresponding serv_prov
To create the histogram you can use unique and histcounts:
[servs, ~, servasint] = unique(servovers);
servhist = histcounts(servasint, [1:numel(servs)+1]);
bar(servhist);
set(gca, 'XTickLabel', servs);
set(gca, 'XTickLabelRotation', 45);
  2 Comments
dpb
dpb on 13 Oct 2015
My thinking above Guillaume, just figured if could check it functioned as thunk it should on some actual data before posting it...
Julian Cruz
Julian Cruz on 14 Oct 2015
Edited: Julian Cruz on 14 Oct 2015
Thanks to both of you for the help! I really appreciate the community helping novices like myself better understand this software and its myriad capabilities!

Sign in to comment.

More Answers (0)

Categories

Find more on Startup and Shutdown in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!