How to show only the same variable

Hello,
I have a .mat file as follows:
Name    adress  company
BOB     london  BIM
Alfred  Paris   BOB
John    BOB     CEF
I would like to display only the duplicated value in order to create this new .mat file:
Name    adress  company
BOB
                BOB
        BOB
Does anyone have an idea how to write code for this?
Thanks in advance

5 Comments

% Data
varnames = ["Name" "adress" "company"];
data = ["BOB" "london" "BIM"
"Alfred" "Paris" "BOB"
"John" "BOB" "CEF"];
T = array2table(data, 'VariableNames', varnames);
% Find the duplicate one
Tc = categorical(T{:,:}); % convert from string to categorical
Tcv = Tc(:); % make it into a long vector
dup = mode(Tcv); % duplicated entries (using mode)
% The location of duplicate one
idx = Tc == dup;
% Generate output
Tout = strings(size(data));
Tout(idx) = dup;
Tout = array2table(Tout, 'VariableNames', varnames)
Tout = 3×3 table
    Name     adress    company
    _____    ______    _______
    "BOB"    ""        ""
    ""       ""        "BOB"
    ""       "BOB"     ""
Thanks a lot.
I thought I could get my solution using this kind of code:
uc1 = unique( Data(:,:));
but apparently it only removes the duplicates within a column, not across a row.
Have a nice day.
Sorry for the late answer.
The detection of the duplicated entries doesn't work:
dup = mode(Tcv); % duplicated entries (using mode)
The same values are placed randomly in the table,
and I need to find where they are...
You accepted Walter's answer, so we assume everything is working perfectly now.
I accepted Walter's code,
but chunru's code is closer to what I was looking for.
Walter's code is more complicated to adapt for the moment,
so I focused on chunru's code and saw that the detection of the duplicated entries didn't work:
dup = mode(Tcv); % duplicated entries (using mode)
This function doesn't seem to work.
The same names (not variables) are placed randomly in the table.
I need to keep the names that are the same and that appear in more than one variable.
Thanks for your understanding.
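For context on the comment above: mode returns only the single most frequent value, so it cannot pick out every name that occurs in more than one variable. The following is a minimal sketch of one way to generalize chunru's idea, assuming the data is a string array laid out like the example in the question. The names vals, nCols, dupVals and out are illustrative and do not come from the thread, and double-quoted strings require R2017a or later.

```matlab
% Example data, same layout as in the question
varnames = ["Name" "adress" "company"];
data = ["BOB"    "london" "BIM"
        "Alfred" "Paris"  "BOB"
        "John"   "BOB"    "CEF"];

% For each distinct value, count how many columns (variables) contain it
vals  = unique(data(:));                      % distinct values as a column vector
nCols = zeros(numel(vals), 1);
for k = 1:numel(vals)
    nCols(k) = sum(any(data == vals(k), 1));  % number of columns containing vals(k)
end

% Values that appear in more than one variable
dupVals = vals(nCols > 1);

% Keep those values in place and leave every other cell as ""
idx = ismember(data, dupVals);
out = strings(size(data));
out(idx) = data(idx);
Tout = array2table(out, 'VariableNames', cellstr(varnames));
```

Counting columns rather than total occurrences means a name repeated only inside a single variable is not treated as a duplicate, which matches the "appear in more than one variable" requirement stated above.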

 Accepted Answer

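% Name, adress and company are assumed to be N-by-1 string arrays
% Values that occur in all three variables
common_values = intersect(intersect(Name, adress), company);
N = length(Name);
% For each variable, keep the common values and leave the rest as ""
NName = strings(N, 1);
mask = ismember(Name, common_values);
NName(mask) = Name(mask);
Nadress = strings(N, 1);
mask = ismember(adress, common_values);
Nadress(mask) = adress(mask);
Ncompany = strings(N, 1);
mask = ismember(company, common_values);
Ncompany(mask) = company(mask);
% Assemble the result (NName, not Nname: MATLAB is case-sensitive)
output = table(NName, Nadress, Ncompany, 'VariableNames', {'Name', 'adress', 'company'});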

8 Comments

Thank you!
It's perfect for a few variables, but it's very complicated for 50 variables...
Do you have a table() object? The code can be automated for a table object.
Question: what should be done for a name that appears in most of the variables but not all of them? Should you keep only names that appear in all variables? Should you keep names that appear in more than one variable?
>>I should keep the names that are the same and that appear in more than one variable.
>>Unfortunately I don't know if it's a table object. It's a table that is read like this:
A = readtable('DataFilter.xlsx','TextType','string');
and I extract the variable names like this:
VarNames = T.Properties.VariableNames;
When I look at the properties:
A.Properties
ans =
  struct with fields:
             Description: ''
                UserData: []
          DimensionNames: {'Row' 'Variables'}
           VariableNames: {1×46 cell}
    VariableDescriptions: {}
           VariableUnits: {}
                RowNames: {}
I hope it helps.
Thank you for your time!
Untested as I do not have a sample of your data.
A = readtable('DataFilter.xlsx','TextType','string');
names_by_var = varfun(@unique, A, 'OutputFormat', 'cell');
[G, ID] = findgroups({names_by_var{:}});
counts = accumarray(G, 1);
names_with_dups = ID(counts>1);
is_dup = varfun(@(V) ismember(V, names_with_dups), A, 'OutputFormat', 'uniform');
Aarray = table2array(A);
B = strings(size(Aarray));
B(is_dup) = Aarray(is_dup);
I get an error when I use it:
Error using findgroups (line 77)
A grouping variable must be a categorical, numeric, logical, datetime,
or duration vector, or a cell array of character vectors.
Error in [G, ID] = findgroups({names_by_var{:}});
The code should work with a table as described below:
Name    adress  company
BOB     london  BIM
Alfred  Paris   BOB
John    BOB     CEF
and it should return:
Name    adress  company
BOB
                BOB
        BOB
A = readtable('DataFilter.xlsx','TextType','string');
names_by_var = varfun(@unique, A, 'OutputFormat', 'cell');
[G, ID] = findgroups(categorical({names_by_var{:}}));
counts = accumarray(G, 1);
names_with_dups = ID(counts>1);
is_dup = varfun(@(V) ismember(categorical(V), names_with_dups), A, 'OutputFormat', 'uniform');
Aarray = table2array(A);
B = strings(size(Aarray));
B(is_dup) = Aarray(is_dup);
Error using categorical
Could not find unique values in DATA using the UNIQUE function.
Error in [G, ID] = findgroups(categorical({names_by_var{:}}));
Caused by:
    Error using cell/unique
    Cell array input must be a cell array of character vectors.
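Both errors above come from passing a cell array whose cells hold string arrays to findgroups or categorical, which expect character vectors inside a cell array. A possible workaround, sketched here under the assumption that every variable in the table is text, is to pool the per-variable unique names into one string column and count them with unique and accumarray instead. The small table below is only a stand-in for the result of readtable, and the names S, pooled and Bt are illustrative; this has not been run against the original DataFilter.xlsx.

```matlab
% Stand-in for: A = readtable('DataFilter.xlsx','TextType','string');
A = table(["BOB"; "Alfred"; "John"], ["london"; "Paris"; "BOB"], ["BIM"; "BOB"; "CEF"], ...
    'VariableNames', {'Name', 'adress', 'company'});

S = table2array(A);                 % N-by-M string array (assumes all variables are text)

% Pool the distinct names of each variable, so a name counts once per variable
pooled = strings(0, 1);
for j = 1:size(S, 2)
    pooled = [pooled; unique(S(:, j))]; %#ok<AGROW>
end

% Count in how many variables each name occurs
[ID, ~, G] = unique(pooled);
counts = accumarray(G, 1);
names_with_dups = ID(counts > 1);   % names found in more than one variable

% Keep those names in place and leave every other cell as ""
is_dup = ismember(S, names_with_dups);
B = strings(size(S));
B(is_dup) = S(is_dup);
Bt = array2table(B, 'VariableNames', A.Properties.VariableNames);
```

Because the loop runs over columns rather than named variables, the same code should apply unchanged to a table with 46 or 50 variables, provided they can all be converted to one string array.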

More Answers (0)

Release

R2016b

Asked:

Ali
on 11 Dec 2021

Commented:

Ali
on 12 Dec 2021