Function gives ans=0 when comparing text

11 views (last 30 days)
Chriss
Chriss on 23 May 2017
Edited: dpb on 28 May 2017
Bad title i know, but here is the problem: I have made a function that should take a social security number as an input and then return the persons name and income and taxes paid. When i use and input it returns the answer 0. The files are imported from excel and made into cell, so i guess the problem lies within the strcmp. what can i use when i compare numbers?
global Citizens Income Taxes
Citizens=[];
Income=[];
Taxes=[]
%Importing:
[~,~,Citizens]=xlsread('Citizens.xlsx');
[~,~,Income]=xlsread('Income.xlsx');
[~,~,Taxes]=xlsread('Taxes.xlsx');
Function file:
function [Name,personalIncome,taxespaid]=socialsecuritylook(serialnumber)
global Citizens;
global Income;
global Taxes;
Name=0;
personaliIncome=0;
taxespaid=0;
serialnumber=0;
for i=2:size(Citizens,1)
if strcmp(Citizens{i,1},serialnumber);
Name = Citizens{i,2};
end
end
for i=2:size(Income,1);
if strcmp(Income{i,1},serialnumber)
personalIncome = {i,2}
end
end
end
original excel files:
Citizens file:
Security number Name Gender Region
103920482 Hans Kale M 10
492012903 Sara Green F 2
Income file:
Security number Income
103920482 72000
492012903 54350
Taxes file:
Region basic tax tax limit higher rate tax
10 25 50000 45
2 20 40000 50
thanks in advance
  4 Comments
Chriss
Chriss on 24 May 2017
Edited: Chriss on 24 May 2017
I have added more to the text, hope it helps. I use the global command as i should be able to change taxes in another function and add new people to the regions. I know strcmp doesn't work but haven't found another way yet (have also tried if serialnumber==Citizens(:,1)).
dpb
dpb on 24 May 2017
The "data storage structure" in question is how you've read it in, specifically. Oh, I see you did include the xlsread call. That with the file, does give enough info to know what Matlab will do with it...
The SSN lookup is then
>> SSN=103920482; % pick a matching value so will not fail...
>> find(cell2mat(citz(2:end,1))==103920482)
ans =
1
>>
Even though you read the raw input, xlsread converts the values in the cells to whatever is its internal form--the SSN values are integers so they're a cell array, the content of which (excepting for the header that's of no interest) is double. Thus the string searches are doomed to failure.
I show a "more better" way using the table class below.

Sign in to comment.

Accepted Answer

dpb
dpb on 24 May 2017
Edited: dpb on 25 May 2017
Read the data files as tables and use categorical variables; will make things downstream much simpler.
>> inc=readtable('incomes.dat');
>> inc.SSN=categorical(inc.SSN)
inc =
SSN Income
_________ ______
103920482 72000
492012903 54350
>> cit=readtable('citizens.dat');
>> cit.Name=categorical(cit.Name);
>> cit.Gender=categorical(cit.Gender)
cit =
SSN Name Gender Region
_________ __________ ______ ______
103920482 Hans Kale M 10
492012903 Sara Green F 2
>> tax=readtable('taxes.dat')
tax =
Region Basic Limit High
______ _____ _____ ____
10 25 50000 45
2 20 40000 50
>>
Then you can write something like--
function [name,inc,taxes]=SSNlookup(SSN,CIT,TAX,INC)
name=CIT.Name(CIT.SSN==SSN);
inc=INC.Income(INC.SSN==SSN);
taxes=figuretax(inc,CIT.Region(CIT.SSN==SSN),TAX);
end
function tax=figuretax(inc,region,TAX)
taxTable=TAX(TAX.Region==region,2:end);
tax=min(inc,taxTable.Limit)*taxTable.Basic/100;
if inc>taxTable.Limit
tax=tax+rem(inc,taxTable.Limit)*taxTable.High/100;
end
end
which can be used as--
>> ssn=input('SSN, please? ','s'); % input SSN as text
SSN, please? 103920482
>> [n,i,t]=SSNlookup(ssn,cit,tax,inc)
n =
Hans Kale
i =
72000
t =
22400
>>
ADDENDUM/ERRATUM
Fixed edit/copy of lookup for income to use INC table; had copied previous line for name and changed it but not CIT-->INC initially.
Also oversight of not using string argument 's' for the input SSN; categorical variable matching works directly on the string so don't need to cast the numerical value as had initially.
END ADDENDUM now back to regularly-scheduled programming
The key things to note are:
  1. No globals in sight; can only update the tables when not using them anyway,
  2. The lookup is handled by matching in the tables in very readable fashion
  2 Comments
Stephen23
Stephen23 on 26 May 2017
+1 for a neat and robust solution
dpb
dpb on 26 May 2017
Thanks...I've come to really like categorical and the table is really good at simplifying lookup operations in conjunction with it.
I can tell if one had really, really large databases it may have some performance issues, but most cases are probably within its grasp; at least until shown otherwise.

Sign in to comment.

More Answers (2)

the cyclist
the cyclist on 24 May 2017
MATLAB variable names are case-sensitive. I notice you have a variable called income, and another called Income. (This is super-confusing.) It looks like you set the value of income to zero, never change that, and then output it.
  1 Comment
Chriss
Chriss on 24 May 2017
The Income is my file, income is the persons income i am trying to display. i could also call that one personalIncome

Sign in to comment.


Chriss
Chriss on 25 May 2017
I found out the mistake. The security number wasn't saved as a number but as a text. The only thing wrong was the if statement where {} should be ().
for i=2:size(Citizens,1)
if strcmp(Citizens(i,1),serialnumber);
Name = Citizens{i,2};
end
end
  5 Comments
Chriss
Chriss on 27 May 2017
Edited: dpb on 28 May 2017
Can you perhaps tell my why this function gives the answer "Wrong" and not "Right"?
v=2:length(Taxes);
for i=2:length(Citizens)
if strcmp(Citizens(i,1),serialnumber)
region = mat2str(Citizens{i,4})
if strcmp(Taxes(v,1),region)
disp('Right')
else
disp('Wrong')
given my Citizens file has the following format:
'Serialnumer' 'name' 'm/f' 'Region'
'154305205' 'Sara' 'f' 5
'195934023' 'Peter' 'm' 2
and my Taxes file has following format:
'Region' 'Basic' 'Limit' 'High'
'1' [27.00] [55000.00] [43.00]
'2' [22.00] [55000.00] [43.00]
dpb
dpb on 27 May 2017
Edited: dpb on 28 May 2017
Yes, but solving it that way is counter-productive.
Convert the data to a consistent format; if you refuse to use the table class which would be ideal for the purpose here, at least convert the numeric values stored as strings to numerics and then collapse the table into a single array without the header incorporated into the array. Then a simple lookup works altho still the categorical for the region in the TAXES table and the other key variables in the other two would be "the cat's meow!" for use as the id variable.
As illustration of what talking of, using the previous data for the TAXES table I've already got still hanging around in memory...
>> Taxes=cell2mat(table2cell(tax)) % convert to the aforementioned array
Taxes =
10 25 50000 45
2 20 40000 50
>> rgn=2; % the wanted region from a previous operation
>> Taxes(Taxes(:,1)==rgn,:) % retrieve that region's data
ans =
2 20 40000 50
>>
Again, no loops in sight nor needed, and certainly not nested loops that will be impossible to expand for further functionality as the job progresses.
This still isn't nearly as easy to write code for or maintain as one has to remember which column of the array holds which variable and use 2D array indexing to retrieve the desired quantity. A table (did I mention the table class yet?) lets you refer to each by name or several by a list of names or using the subscript expression in places where that really is the most efficient.
ADDENDUM
BTW, it takes some effort to "get there from here" having read the raw data originally (which, coincidentally, is the root source of much of your trouble in having to deal with the mishmash of various data types for the same thing and the cell array) -- I loaded your above into cell array t; use your current Taxes array there:
>> Taxes=[cellfun(@str2num,t(2:end,1)) cell2mat(t(2:end,2:end))];
Taxes =
1 27 55000 43
2 22 55000 43
>>
and voila!! a much easier dataset to deal with. If you'd used the first num return from xlsread from the git-go, that's the result would have had initially it would appear (altho we've not seen the actual spreadsheet but the presumption is pretty solid based on raw content shown).
And, to show it really isn't difficult (in fact, one might say it's even simpler than the above if just read the doc a little and follow a few examples), to convert the above cell array to the table (didn't I mention the table class yet? :) ), starting with the same t here:
>> Taxes=cell2table(t(2:end,:)); % data to table from cell
>> Taxes.Properties.VariableNames=t(1,:); % set the variable names
>> Taxes.Region=categorical(Taxes.Region) % convert region from cellstr
Taxes =
Region Basic Limit High
______ _____ _____ ____
1 27 55000 43
2 22 55000 43
>>
With either of the above, all your searching problems, associated loops and if's just melt away into the night. As the original example shows, particularly the latter really documents the solution in code as well.

Sign in to comment.

Categories

Find more on Tables in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!