How do I use 'textscan' to correctly read a formatted date?

I am trying to use the %{frmt}D syntax in textscan with a date, but am unable to get it to work, and instead observe unexpected behavior.
I have the following code. First, I define a date:
>> sTest = '[11-11-2016 17:08:53.453]; test~';
I then try to match the %{frmt}D syntax to the date provided in 'sTest', and have the rest of the input match %q:
>> out1= textscan(sTest, '[%{MM-dd-yyyy HH:mm:ss.SSS}D]%q', 'Delimiter',{'~'});
Error using textscan
Unable to read the DATETIME data with the format 'MM-dd-yyyy HH:mm:ss.SSS'. If
the data is not a time, use %q to get text data.
>> out2 = textscan(sTest, '[%{MM-dd-yyyy HH:mm:ss.SSS}D]%q', 'Delimiter',{']','~'})
out2 =
1×2 cell array
[11-11-2016 17:08:53.453] {0×1 cell}
>> out3 = textscan(sTest, '[%{MM-dd-yyyy HH:mm:ss.SSS}D%q', 'Delimiter',{']','~'})
out3 =
1×2 cell array
[11-11-2016 17:08:53.453] {1×1 cell}
However, I get unexpected results, as seen above. The first case produces an error, the second does not pick up the second portion of the input, and the third case works, but I do not understand why. Can you explain what is happening?

Accepted Answer

MathWorks Support Team on 24 May 2017
When textscan reads the input, it attempts to match the data to the format specified in the format specification, formatSpec, as detailed in the documentation. If textscan fails to match a data field, it stops reading and returns all fields read before the failure. The provided examples highlight some of the issues that may occur if the text input and format specification are not carefully matched.
1. The first case produces an error because the first character in the formatSpec, '[', is a literal that consumes the first character in 'sTest'. As a result, the %{frmt}D specifier tries to read everything from the second character in 'sTest' up to the next delimiter. Because the only delimiter, '~', is at the end of the input, '%{MM-dd-yyyy HH:mm:ss.SSS}D' tries to make a date out of '11-11-2016 17:08:53.453]; test'. You may verify this with the following command:
>> text = '[11-11-2016 17:08:53.453]; test~';
>> test1 = textscan(text, '[%q]%q', 'Delimiter',{'~'})
test1 =
1×2 cell array
{1×1 cell} {0×1 cell}
The second cell of 'test1' is empty because nothing appears after the delimiter.
2. In the next case, ']' is a delimiter, so the first field ends at the last character before it. However, the literal ']' in the format specification is then expected to appear after that delimiter. It does not, so textscan stops reading and the second field is left empty.
>> out2 = textscan(text, '[%{MM-dd-yyyy HH:mm:ss.SSS}D]%q', 'Delimiter',{']','~'})
out2 =
  1×2 cell array
    [11-11-2016 17:08:53.453]    {0×1 cell}
If the date were followed by ']]' instead of ']', the literal would match and textscan would not stop reading. This may be verified with the following commands:
>> mod_text = '[11-11-2016 17:08:53.453]]; test~';
>> test2 = textscan(mod_text, '[%{MM-dd-yyyy HH:mm:ss.SSS}D]%q', 'Delimiter',{']','\n'})
test2 =
  1×2 cell array
    [11-11-2016 17:08:53.453]    {1×1 cell}
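When 'textscan' fails to match a literal, it stops reading. By default it returns what it has read so far without complaint; setting 'ReturnOnError' to 0 makes it raise an error instead, which shows where the match fails:
>> test3 = textscan(text, '[%{MM-dd-yyyy HH:mm:ss.SSS}D]%q', 'Delimiter',{']','~'}, 'ReturnOnError', 0);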
Error using textscan
Mismatch between file and format character vector.
Trouble reading 'Literal' field from file (row number 1, field number 3) ==> ;
test~\n
3. The final case works because the literal ']' has been removed from the format specification, so there is no literal left to fail to match. The ']' in the input is still treated as a delimiter, which ends the date field before %{frmt}D can read it:
>> out3 = textscan(text, '[%{MM-dd-yyyy HH:mm:ss.SSS}D%q', 'Delimiter',{']','~'})
out3 =
1×2 cell array
[11-11-2016 17:08:53.453] {1×1 cell}
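To use the result of the working call, the two fields can be pulled out of the returned cell array. The following is a minimal sketch based on the third case above; the variable names 'timestamp' and 'remainder' are only illustrative, and the exact whitespace kept in the second field may depend on the 'Whitespace' settings in use.

```matlab
% Input line from the question
text = '[11-11-2016 17:08:53.453]; test~';

% '[' is matched as a literal. ']' and '~' are delimiters, so no literal
% ']' is needed in the format specification after the date.
out = textscan(text, '[%{MM-dd-yyyy HH:mm:ss.SSS}D%q', 'Delimiter', {']','~'});

timestamp = out{1};      % datetime value read by %{frmt}D
remainder = out{2}{1};   % character vector read by %q

disp(timestamp)
disp(remainder)
```

Because 'out{1}' is a datetime, it can be used directly in date arithmetic or comparisons without further conversion.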
