Storing many digits using readtable

Hi all,
I have a question about storing long numbers using readtable. I have a CSV file with comma-delimited data. Is it possible to store up to 18 digits using scientific notation, with readtable or any other function? Or is it possible to cast to a certain number of digits (12, 15) using readtable? I've seen that scientific notation allows up to 15 digits; is there a way to force it?
Attached is an example of a row of the CSV file I have to read from. As you can see, the fifth value, for example, is shown with 15 digits in scientific notation (even if 18 digits are stored). In any case, the last 3 digits (16, 17, 18) come out random in subsequent processing.
Original value: -1298796679279255862
As it's going to be stored and visualized:
format long
-1.298796679279256e+18
The last 3 digits, instead of being "862", come out random.
Here's the function call:
S = readtable(rawCsvFile,'FileType','text');
Any help would be really appreciated.

Accepted Answer

Stephen23 on 3 Dec 2021
Edited: Stephen23 on 3 Dec 2021
Any advice that "you are going to need to read the file as text" is incorrect.
It is much better to import and store numeric data as numeric, if possible. And it really is very easy, because perfectly normal UINT64 and INT64 numeric types will correctly import all of the long integers in your example file (but of course you need to be aware of the limits to those number types, i.e. INTMAX and INTMIN).
opt = detectImportOptions('example.txt');
opt = setvartype(opt,'AE','int64');
tbl = readtable('example.txt',opt)
tbl = 1×30 table
AA          AB          AC   AD   AE                    AF  AG      AH   AI   AJ  AK  AL  AM  AN          AO  AP      AQ     AR         AS  AT          AU          AV          AW   AX   AY   AZ  BA   BB  BC      BD
__________  __________  ___  ___  ____________________  __  ______  ___  ___  __  __  __  __  __________  __  ______  _____  _________  __  __________  __________  __________  ___  ___  ___  __  ___  __  ______  __________
2.0182e+06  1.9457e+12  NaN  NaN  -1298796679279255862  0   5.2335  NaN  NaN  0   5   0   47  2.9302e+14  13  41.805  605.1  0.0015561  17  1.1557e+06  0.00077804  1.5754e+09  NaN  NaN  NaN  0   NaN  1   48.498  1.5754e+09
Take a look at the variable AE (I added headers to your data file to make this example clearer), all of the digits are there and it is a perfectly normal numeric data type (no ugly text or symbolic). You can specify the other column types too.
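As a rough illustration of why the default double import loses the trailing digits while int64 keeps them, the limits mentioned above can be checked directly. This is only a sketch using built-in functions; the literal passed to eps is the rounded value from the question, not something read from the file.

```matlab
% Doubles represent every integer exactly only up to flintmax = 2^53.
flintmax                          % 9007199254740992

% Near 1.3e18 adjacent doubles are 256 apart, so digits 16-18 get rounded.
eps(1.298796679279256e18)         % 256

% The original value lies inside the int64 range, so int64 keeps every digit.
intmin('int64')                   % -9223372036854775808
intmax('int64')                   %  9223372036854775807
```

So the last digits of the double are not really random: the value is rounded to the nearest representable double, which near this magnitude is a multiple of 256.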
  3 Comments
Stephen23 on 3 Dec 2021
Edited: Stephen23 on 3 Dec 2021
@Walter Roberson: which is why I already mentioned that restriction in my answer.
And if importing as text really is required (e.g. due to the range/number of digits) then we can still use exactly the same simple approach, with the benefit that all of the other data is still automatically, correctly, and efficiently imported as numeric/whatever:
opt = detectImportOptions('example.txt');
opt = setvartype(opt,'AE','string'); % string!
tbl = readtable('example.txt',opt)
tbl = 1×30 table
AA          AB          AC   AD   AE                      AF  AG      AH   AI   AJ  AK  AL  AM  AN          AO  AP      AQ     AR         AS  AT          AU          AV          AW   AX   AY   AZ  BA   BB  BC      BD
__________  __________  ___  ___  ______________________  __  ______  ___  ___  __  __  __  __  __________  __  ______  _____  _________  __  __________  __________  __________  ___  ___  ___  __  ___  __  ______  __________
2.0182e+06  1.9457e+12  NaN  NaN  "-1298796679279255862"  0   5.2335  NaN  NaN  0   5   0   47  2.9302e+14  13  41.805  605.1  0.0015561  17  1.1557e+06  0.00077804  1.5754e+09  NaN  NaN  NaN  0   NaN  1   48.498  1.5754e+09
This also demonstrates that it is not required to import the file as text.
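The same import-options approach should extend to several columns at once. A minimal sketch, assuming (hypothetically) that column AN also holds long integers that fit in int64; only AE is confirmed by the example above:

```matlab
opt = detectImportOptions('example.txt');

% Inspect what was auto-detected before overriding anything.
disp(opt.VariableTypes)

% Assumption: AE and AN both hold integers within the int64 range.
opt = setvartype(opt, {'AE','AN'}, 'int64');

tbl = readtable('example.txt', opt);
class(tbl.AE)   % check that the column was imported with the requested type
```

Every column not named in setvartype keeps its auto-detected type, so the rest of the file is still imported as ordinary numeric data.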
Marco Giangolini on 14 Dec 2021
thank you very much! This one worked perfectly in my case!


More Answers (1)

Walter Roberson on 3 Dec 2021
To preserve those digits, you are going to need to read the file as text and store the long numbers as either text or as symbolic numbers.
filename = 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/822235/example.txt';
str = urlread(filename);
temp = regexp(str, ',', 'split');
S = nan(1,length(temp),'sym');
mask = strcmpi(temp, 'NaN') | cellfun(@isempty, temp);
S(~mask) = sym(temp(~mask));
S
S = 
If you look closely, you may notice an extra NaN at the end. The file ends in a comma, and for .csv files that means an empty field, so NaN has to be put in there.
This code handles empty fields, and also handles cases where NaN appears in lowercase as nan.
