Most efficient way of tackling this problem

I have two variables (always 4 letter strings), Start and End. I need to generate other variables based on their values. *'s are wildcards (any capitalised letter).
The Start or End is classed as "Inner" if it is:
  • YF**
  • YG**
  • GRA*
  • EHEH
  • EHAM
The Start or End is classed as "Outer" if it is:
  • Y*** (except any inner values)
  • G*** (except any inner values)
  • BI**
  • BKPR
The Start or End is classed as "Far" if it is:
  • EGYP
  • Any other value
What's the most efficient way of doing this?
  5 Comments
Walter Roberson on 25 Aug 2015
Edited: Walter Roberson on 25 Aug 2015
If Far is EGYP or "any other value", why single out EGYP? Why not just say "any other value", or more concisely "any value"? Perhaps "any other value" means "any value not classified as Inner or Outer"? Even so, there is no need to list EGYP separately.
J on 27 Aug 2015
Good point, whoops! Yeah, EGYP was supposed to be YFYP, hence I modified your code slightly.

Accepted Answer

Walter Roberson on 25 Aug 2015
Edited: Walter Roberson on 26 Aug 2015
% Logical masks: true where a string matches one of the Inner / Outer patterns.
var_is_Inner = ~cellfun(@isempty, regexp(Varlist, '^Y[FG]|^GRA|EHEH|EHAM'));
var_is_Outer = ~cellfun(@isempty, regexp(Varlist, '^Y[^FG]|^G[^R]|^GR[^A]|^BI|BKPR'));
% Anything that is neither Inner nor Outer is Far.
var_is_Far = ~(var_is_Inner | var_is_Outer);
% Build one category label per input string.
var_category = cell(length(Varlist), 1);
var_category(var_is_Inner) = {'Inner'};
var_category(var_is_Outer) = {'Outer'};
var_category(var_is_Far) = {'Far'};
Varlist should be a cell array of strings for this code, even if there is only one string. A single string should be wrapped in braces, e.g. {Start}, because regexp returns a plain numeric array for a char input and cellfun only accepts cell arrays.
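As a usage sketch, the code above can be applied to the two variables from the question by putting them into one cell array. The values assigned to Start and End here are made up for illustration, and startCategory / endCategory are hypothetical names.

```matlab
% Illustrative inputs (not from the original post).
Start = 'YFAB';
End   = 'BKPR';

% Wrap both strings in a cell array so regexp returns a cell array too.
Varlist = {Start; End};

var_is_Inner = ~cellfun(@isempty, regexp(Varlist, '^Y[FG]|^GRA|EHEH|EHAM'));
var_is_Outer = ~cellfun(@isempty, regexp(Varlist, '^Y[^FG]|^G[^R]|^GR[^A]|^BI|BKPR'));
var_is_Far   = ~(var_is_Inner | var_is_Outer);

var_category = cell(length(Varlist), 1);
var_category(var_is_Inner) = {'Inner'};
var_category(var_is_Outer) = {'Outer'};
var_category(var_is_Far)   = {'Far'};

% Unpack the labels in the same order the strings were packed.
startCategory = var_category{1};
endCategory   = var_category{2};
fprintf('Start is %s, End is %s\n', startCategory, endCategory);
```

Classifying both strings in one call keeps the patterns in a single place, so a change to the rules applies to Start and End alike.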
  2 Comments
Walter Roberson on 26 Aug 2015
Repaired. I should not have been asking for 'match'
J on 27 Aug 2015
Perfect! Only modified it slightly for an additional variable I had forgotten.

More Answers (1)

Cedric on 26 Aug 2015
Edited: Cedric on 27 Aug 2015
Here is the beginning of an answer while I am waiting for my plane.
"Most efficient" can have multiple meanings here. Does it mean that you want the code to be fast, or that you want the code to be concise, or that you want the code to scale well to larger similar problems?
If you want the code to be fast, you need to reformulate your problem so that it can be vectorized (we can talk about this a little later). If you want it to be fast but you still plan to loop over cases, and you can afford to write a very specific solution, you can solve the problem with a few nested IF statements, e.g.
function category = getCategory( bnd )
    % Returns a numeric category ID: 1 = Inner, 2 = Outer, 3 = Far.
    if bnd(1) == 'Y'
        if bnd(2) == 'F' || bnd(2) == 'G'
            category = 1 ;
        else
            category = 2 ;
        end
    elseif bnd(1) == 'G'
        if all( bnd(2:3) == 'RA' )
            category = 1 ;
        else
            category = 2 ;
        end % EDITED
    elseif all( bnd(1:2) == 'BI' ) || strcmp( bnd, 'BKPR' )
        category = 2 ;
    elseif any( strcmp( bnd, {'EHEH','EHAM'} ))
        category = 1 ;
    else
        category = 3 ;
    end
end
Using this, we get a category ID for both boundaries as follows
catId_start = getCategory( Data.Car.Start ) ;
catId_end = getCategory( Data.Car.End ) ;
which is easier and faster to test than strings like 'far' for example.
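A minimal sketch of how the numeric IDs might be used afterwards. It assumes the ID convention read off the branches of getCategory (1 = Inner, 2 = Outer, 3 = Far). The names categoryNames and isInnerToFar are hypothetical, and the two IDs are hard-coded stand-ins so that the snippet runs on its own.

```matlab
% Assumed ID convention, read off the branches of getCategory:
% 1 = Inner, 2 = Outer, 3 = Far.
categoryNames = {'Inner', 'Outer', 'Far'} ;

% Stand-in values for illustration; in practice these would come from
% getCategory( Data.Car.Start ) and getCategory( Data.Car.End ).
catId_start = 1 ;
catId_end   = 3 ;

% Tests on the IDs are plain integer comparisons.
isInnerToFar = ( catId_start == 1 ) && ( catId_end == 3 ) ;

% Convert back to text only where a label is needed, e.g. for display.
fprintf( 'Start: %s, End: %s\n', ...
    categoryNames{catId_start}, categoryNames{catId_end} ) ;
```

Keeping the lookup table next to the tests means the text labels are defined only once.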
If you want it to be more versatile and/or more scalable, you can use pattern matching with REGEXP for example.
PS: I don't have MATLAB available for testing at the moment, but the function above shows the concept.
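One possible shape for the more scalable REGEXP variant is a small rule table that is checked in priority order, so that adding or changing a category only touches one row. This is a sketch under assumptions, not a tested drop-in: the names rules, defaultCategory and codes are hypothetical, the patterns restate the rules from the question, and the sample codes are made up.

```matlab
% Hypothetical rule table: one row per category, highest priority first.
% In each pattern, '.' stands for the '*' wildcard of the question.
rules = { ...
    'Inner', '^(Y[FG]..|GRA.|EHEH|EHAM)$' ; ...
    'Outer', '^([YG]...|BI..|BKPR)$' } ;
defaultCategory = 'Far' ;

% Illustrative inputs (not from the original post).
codes = {'YFAB', 'YABC', 'GRAX', 'GRBX', 'EHAM', 'BIXX', 'BKPR', 'ABCD'} ;

% Start from the default, then apply the rules from lowest to highest
% priority so that a higher-priority match overwrites a lower one.
category = repmat( {defaultCategory}, size( codes )) ;
for k = size( rules, 1 ) : -1 : 1
    isMatch = ~cellfun( @isempty, regexp( codes, rules{k,2}, 'once' )) ;
    category(isMatch) = rules(k,1) ;
end
```

Because Inner is applied last, the Outer pattern can stay as broad as the question states it (Y*** and G***) without having to spell out the exceptions.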
  4 Comments
J on 27 Aug 2015
Hmmmm, that's a good point. Okay, I guess if the destination categories were changing frequently it might be a good idea to use the REGEXP version. I'll reimplement your solution and compare them. Thanks again!!
Cedric on 27 Aug 2015
Edited: Cedric on 27 Aug 2015
I would recommend thinking carefully about what you really need to optimize, especially if the code is large and other people will have to understand and update it later.
To illustrate my point, in most of my development work speed is only the 3rd or 4th priority. The top priorities are to make the code clear, stable, well documented, easily maintainable, etc. Only at a few strategic places do I really favor speed. At these places, I compensate for the lack of clarity or for the complexity of the code with extensive help text, illustrative examples, etc.
