How can my code, which collects bad data with 'cellfun' and 'try, catch', be improved?

My real situation is that I have a large number of files, some of which I suspect are bad. I want to know the file names of all the bad files so that they do not go into the downstream workflow.
Here is a toy example I wrote to show the problem. I only want to keep the bad data, so I must remove the good data (coded as '0') from cellfun's output. Is there a better way to do the whole thing? I appreciate any suggestions you have.
list = {'a', 'bc', 'defg'};
T = cellfun(@(x) func(x), list, 'UniformOutput',false)
Warning: --- error caused by bad data ---.\n
a
  MException with properties:
    identifier: 'MATLAB:badsubscript'
    message: 'Index exceeds the number of array elements. Index must not exceed 1.'
    cause: {}
    stack: [5×1 struct]
    Correction: []
Warning: --- error caused by bad data ---.\n
bc
  MException with properties:
    identifier: 'MATLAB:badsubscript'
    message: 'Index exceeds the number of array elements. Index must not exceed 2.'
    cause: {}
    stack: [5×1 struct]
    Correction: []
T = 1×3 cell array
{'a'} {'bc'} {[0]}
function out = func(x)
    try
        x(4); % This raises an error for a character vector shorter than 4.
        out = 0;
    catch ME
        warning("--- error caused by bad data ---.\n")
        out = x; % This is to collect the bad data.
        disp(x)
        disp(ME)
    end
end
  2 Comments
dpb on 11 Sep 2022
I'd revert to asking what defines a "bad" file vis a vis a "good" one...and what is the input format for the files?
Simon on 12 Sep 2022
I have lots of .html files. They will be further processed into MATLAB tables. When they are processed with readtable(htmlfile), two of them, just detected by my code, give the following error.
MException with properties:
identifier: 'MATLAB:io:common:xmlTree:ParseError'
message: 'Error in XML: Premature end of data in tag body line 71↵'
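For the real case described in this comment, one possible sketch is to loop over the files, attempt readtable on each, and record the failures in a logical mask. The folder name "myHtmlFolder" and the variable names are hypothetical placeholders, and the only assumption about "bad" is that readtable throws an error on the file.

```matlab
% Hypothetical folder; replace with the real location of the .html files.
files = dir(fullfile('myHtmlFolder', '*.html'));

% Build full paths; this also works when no files are found.
names = arrayfun(@(f) fullfile(f.folder, f.name), files, 'UniformOutput', false);

isBad = false(size(names));       % true marks a file that failed to parse
for k = 1:numel(names)
    try
        readtable(names{k});      % result discarded; only checking that it parses
    catch ME
        isBad(k) = true;
        warning('Could not read %s: %s', names{k}, ME.message);
    end
end

badFiles  = names(isBad);         % keep these out of the downstream workflow
goodFiles = names(~isBad);
```

A plain loop is used here instead of cellfun so that the warning can name the file that failed; the mask can be applied to the list afterwards in either direction.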


Accepted Answer

Paul on 11 Sep 2022
Have func() return a logical with true indicating the file is bad and false indicating the file is good. In this case we'd have
list = {'a', 'bc', 'defg'};
T = cellfun(@(x) func(x), list)
Warning: --- error caused by bad data ---.\n
a
  MException with properties:
    identifier: 'MATLAB:badsubscript'
    message: 'Index exceeds the number of array elements. Index must not exceed 1.'
    cause: {}
    stack: [5×1 struct]
    Correction: []
Warning: --- error caused by bad data ---.\n
bc
  MException with properties:
    identifier: 'MATLAB:badsubscript'
    message: 'Index exceeds the number of array elements. Index must not exceed 2.'
    cause: {}
    stack: [5×1 struct]
    Correction: []
T = 1×3 logical array
1 1 0
% remove the good data
list(T)
ans = 1×2 cell array
{'a'} {'bc'}
The try/catch scheme may not be necessary, depending on what the criteria really are for determining the goodness of a file and how those criteria can be tested.
function out = func(x)
    out = false;
    try
        x(4); % This raises an error for a character vector shorter than 4.
    catch ME
        warning("--- error caused by bad data ---.\n")
        out = true; % This is to collect the bad data.
        disp(x)
        disp(ME)
    end
end
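To illustrate the remark that try/catch may not be needed: if "bad" in the toy example simply means "shorter than four characters" (an assumption taken from the x(4) test, not from the real files), the criterion can be checked directly. The names isBad and badItems are only illustrative.

```matlab
list = {'a', 'bc', 'defg'};

% Test the criterion directly instead of provoking an indexing error.
isBad = cellfun(@(x) numel(x) < 4, list);

% Keep only the bad entries.
badItems = list(isBad);
```

This gives the same logical mask as the try/catch version for this list, without the warnings. For real files, whether such a direct test exists depends on what makes a file bad.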
  1 Comment
Simon on 12 Sep 2022
Edited: Simon on 12 Sep 2022
That's a wonderfully clean solution. Even more beneficial is that I learned a good place to use local values. I am grateful for your help.
I want to accept your answer, but when I click 'Accept', I receive an error message, asking me to reload this page. I did that, and it still would not let me click 'Accept.' I will try again later.

Release

R2022a