Dealing with ugly data - Fields with number and string data types

1 view (last 30 days)
Hi,
Please see the attached UglyDataTable.xlsx. The first row has the names of the fields. In reality, my data table is nearly 300k rows.
1) My task is to sum up the values in FDBS_distribution_hours for a given FDBS_part. The first issue is that the FDBS_part field is "padded" out to 40 characters. The value I'm using to filter the table by FDBS_part is not padded - so, how do I remove the padding from the FDBS_part values in the table, OR, how do I add the padding to my filter variable?
2) Not all FDBS_part values are character strings. Some are numbers. How do I go about changing these values to 40-character-padded character strings?
I will likely have more questions on this for whomever responds. Thanks for the feedback!
Cheers!
JF
  2 Comments
Matt Kindig
Matt Kindig on 8 Apr 2014
My first thought is: can you export the data into a text format (tab-delimited, or better yet, comma-separated)? That would make manipulating the data much easier in MATLAB; plus the file will read into memory faster.
dpb
dpb on 8 Apr 2014
Edited: dpb on 8 Apr 2014
1) Just use xlsread to get the data -- it'll come back as a cell for the mixed columns so it'll already be text.
2) Length won't matter; just use a find operation to match substring rather than exact full length comparison or use strtrim if needed.
Just give it a go...it'll likely not be as daunting as you think. Start with
doc xlswread % how to read an xls file
help strfun % to see the overall string function list from which to choose

Sign in to comment.

Answers (0)

Categories

Find more on Characters and Strings in Help Center and File Exchange

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!