How do I import numerical entries from an excel sheet as strings?

15 views (last 30 days)
I have an excel sheet in which I want to extract a column of part numbers. The part numbers are not "numbers" as they contain multiple decimal points and symbols such as 04.613.514/516. I was hoping I could extract the part number column into a vector for later operations. Another issue is maintaining significant digits (for example, part number 324.070 becomes 324.07).
  4 Comments
dpb
dpb on 29 Aug 2018
Presuming you know which column and that is consistent, shouldn't be hard to do with the detectImportOptions spreadsheet object and setvartype then read with readtable instead of xlsread.
As Walter suggests, providing an input worksheet to look at would be invaluable.

Sign in to comment.

Accepted Answer

dpb
dpb on 29 Aug 2018
Edited: dpb on 29 Aug 2018
opt=detectImportOptions('testmatlabpartnumberreadin.xlsx'); % see what Matlab thinks
opt.DataRange='A1'; % start data w/ no header
opt.VariableNamesRange=''; % no variable names to read
opt.VariableNames={'ID'}; % set a useful name
opt=setvartype(opt,'ID','categorical'); % make it a categorical variable type
dat=readtable('testmatlabpartnumberreadin.xlsx',opt); % and read the spreadsheet to table
Show what we got...
>> dat
dat =
11×1 table
ID
_________
3.624.256
3.624.257
3.624.258
3.624.259
3.624.260
3.624.261
3.624.262
3.624.263
3.624.264
3.624.265
3.624.266
>>
Treating the ID/Part Number as categorical variable retains all text but is much easier for lookup or other operations than string-matching. It's exactly what they're there for... :)
And, the table is almost certainly going to be a much easier storage format for whatever your needs are than general cell array of combinations of various types of variables.
  1 Comment
dpb
dpb on 29 Aug 2018
Edited: dpb on 29 Aug 2018
I'm presuming the real spreadsheet will be much more complex; for something this trivial one could dispense with the full-blown import object with a couple of input switches on readtable and a fixup or two on the result but when things start to get complex the options structure is a way to consolidate things and oftentimes save quite a bit of cleanup otherwise needed.
It also has the benefit that you can create one import object and use it with any number of individual files that follow the same format; you don't have to remake it but once and you're done (until somebody messes up the spreadsheet on you, anyway! :) ).
There's one really, really annoying thing regarding detectImportOptions, however, is that for at least a spreadsheet such as this with only text in the data field, one cannot force it to not include the first line as variable names; even if one sets
opt=detectImportOptions('file.xls','Range','A1');
it still returns A1 as the variable name and A2 as the starting data location. It desperately needs the 'VariableNamesRange' and/or 'DataRange' field(s) to be able to be set to force proper interpretation and save having to fix up those later on.
Hmmmm....I don't know if readtable will override opt...
readtable('testmatlabpartnumberreadin.xlsx',opt,'ReadVariableNames',0)
ans =
10×1 table
x3_624_256
___________
'3.624.257'
'3.624.258'
....
so, No!. :( Looks like enhancement request time...

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!