MATLAB Answers

How to arrange misplaced data elements

Temitope on 21 Jan 2020
Edited: Guillaume on 22 Jan 2020 at 10:18
Hi all.
I have data in Excel format in which values appear under the wrong headers, in varying ways, for example:
Name Age Sex Weight Address Height
John M 20 House 2, collins street 120 55
This is not the real data, because I am not permitted to share it.
Is there a way to rearrange the data so that the values sit under the right headers using MATLAB?
Thank you for your help.

  2 Comments

Guillaume on 21 Jan 2020
Really, the best thing is to go back to whatever wrote these incorrect files and fix the issue at the source.
Fixing the issue in MATLAB may be possible, but there is certainly no built-in tool to do it. You could possibly come up with rules that identify which entry should go where, but you would have to write these yourself.
With your example, sex should be easy to identify: it is either 'M' or 'F'. You could possibly differentiate name and address by assuming that an address contains both numbers and text, whereas a name never contains a number. Age, weight and height are a lot more ambiguous, particularly since we don't know the units. Is 120 the height in cm, the weight in kg, or a very old person? Is 20 the weight in stone or the age? Is 55 a height of 5'5'', the age, or the weight in kg?
If you can come up with rules for each variable, we can help you write the code, but I suspect some manual clean-up would always be required.
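As a minimal sketch of what such rules might look like, the snippet below classifies the entries of the single example row from the question. The rules themselves (sex is exactly 'M' or 'F', an address has both digits and letters, a name has letters only) are assumptions, and the variable names are hypothetical. The three numeric values are deliberately left unassigned, since nothing here can tell age, weight and height apart.

```matlab
% Example row from the question, as a cell array of char vectors.
row = {'John', 'M', '20', 'House 2, collins street', '120', '55'};

% Per-entry features (assumed rules, adjust to the real data).
hasDigit  = ~cellfun(@isempty, regexp(row, '\d', 'once'));
hasLetter = ~cellfun(@isempty, regexp(row, '[A-Za-z]', 'once'));

isSex     = ismember(row, {'M', 'F'});      % exactly 'M' or 'F'
isAddress = hasDigit & hasLetter;           % digits and letters together
isName    = hasLetter & ~hasDigit & ~isSex; % letters only, not a sex code
isNumeric = ~isnan(str2double(row));        % age, weight or height

% Accept the row only if every rule matched exactly as expected.
if nnz(isSex) == 1 && nnz(isName) == 1 && nnz(isAddress) == 1 && nnz(isNumeric) == 3
    fixed = struct('Name', row{isName}, 'Sex', row{isSex}, 'Address', row{isAddress});
    fixed.Ambiguous = str2double(row(isNumeric)); % order still unknown
else
    fixed = []; % flag the row for manual clean-up
end
```

Rows for which `fixed` comes back empty would need to be checked by hand, and the `Ambiguous` values still need unit-based rules before they can be assigned to columns.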
Temitope on 22 Jan 2020 at 9:01
Hi Guillaume,
Thank you for your comment. The file is from a backup source because the original system crashed.
Let me see what I can do about the rules and get back to you.
Thanks.


Answers (1)

Bhaskar R on 21 Jan 2020
Edited: Bhaskar R on 21 Jan 2020
Read your Excel data as a table:
T = readtable('<your excel file>'); % read the file into a table
Reassign the headers to get what you want (here, swapping the Weight and Address headers):
T.Properties.VariableNames = {'Name', 'Age', 'Sex', 'Address', 'Weight', 'Height'};
Write the modified table back to a file:
writetable(T, '<your file name>');

  2 Comments

Temitope on 21 Jan 2020
Thank you Bhaskar, but the issue is that there are about 1000 rows and the misplacement is not the same in each one. The value that belongs under the "Age" header, for instance, could be under "Address" in one row but under "Sex" in another.
Image Analyst on 21 Jan 2020
Attach the file. Chances are that it's a CSV file with missing or extra delimiters. But we'd need to see it.
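One quick way to test that guess, sketched here under the assumption that the data really is comma-delimited text, is to count the delimiters on each raw line and compare against the header. The two lines below are hypothetical: they reuse the header and example row from the question, written out with commas. In practice the lines would come from the file itself, for example via `fileread` and `strsplit`.

```matlab
% Hypothetical raw lines: header plus the example row from the question.
rawLines = { ...
    'Name,Age,Sex,Weight,Address,Height', ...
    'John,M,20,House 2, collins street,120,55'};

% Count commas on each line.
nDelims = cellfun(@(s) nnz(s == ','), rawLines);

% Lines whose comma count differs from the header are suspect.
expected = nDelims(1);
suspect  = find(nDelims ~= expected);
```

Here the unquoted comma inside the address gives the data row one delimiter more than the header, so it is flagged. Note that this naive count would also flag commas inside properly quoted fields.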
