Datetime with variable format
Show older comments
I have an output file that has timestamps in the form of 'yyyy-MM-dd HH:mm:ss.S' or 'yyyy-MM-dd HH:mm:ss'. I would like a clean way to convert them from date strings into an array of date time format.
Example code:
a={'2016-02-09 10:28:00';'2016-02-09 10:28:01.5'}
out=datetime(a,'InputFormat','yyyy-MM-dd HH:mm:ss.S','InputFormat','yyyy-MM-dd HH:mm:ss','Format','yyyy-MM-dd HH:mm:ss.S');
Actual output:
out =
2016-02-09 10:28:00.0
NaT
Desired output:
out =
2016-02-09 10:28:00.0
2016-02-09 10:28:00.5
Accepted Answer
More Answers (4)
Peter Perkins
on 18 Feb 2016
2 votes
Winston, I recommend that you NOT mix datenum and datetime. If you're in R2014b or later, don't use datenums. Among other things, because they count in units of days, using datenums runs the risk of round-off problems for sub-second timestamps.
You're right, datetime requires a single format. But it's pretty easy to get what you want.Convert using the first format. You'll get a vector of datetimes, with NaTs where the format was different. Then use isnat to convert those strings with the other format and assign into the appropriate elements.
Hope this helps.
I'm frankly surprised the above doesn't give a syntax error on the duplicated named input, but guess it likely just takes the first occurrence and doesn't count the number...will have to do some testing on that behavior; never thought of even trying it before.
Anyway, the doc says " All strings in DateStrings must have the same format." so you have only one shot at getting it right! :)
If you can't fixup the input format to become consistent, you'll have to fixup the data before converting--
>> a={'2016-02-09 10:28:00';'2016-02-09 10:28:01.5'};
>> ix=cellfun(@isempty,strfind(a,'.')) % find the locations missing the fractional seconds
ix =
1
0
>> a(ix)={[char(a(ix)) '.0']} % add the superfluous '.0' to make format match
a =
'2016-02-09 10:28:00.0'
'2016-02-09 10:28:01.5'
>>
Now you can convert with the fractional second format.
I thought you might get lucky if you skipped defining an input format string entirely but the requirement for all to be the same means that will also fail, undoubtedly.
Brendan Hamm
on 11 Feb 2016
Edited: Brendan Hamm
on 11 Feb 2016
Unfortunatelly you cannot specify multiple InputFormats simultaneously. Common to Name-Value pairs is that the last instance of a name value pair is used; in your example this is: 'yyyy-MM-dd HH:mm:ss'. One thing you can do to work around this issue is to convert the cell string to a vector of MATALB Serial Date Numbers and then to a datetime:
d = datenum(a);
d = datetime(d,'Format','yyyy-MM-dd HH:mm:ss.S','convertFrom','datenum');
I would caution you here as if you attempt to change the format in the line d = datenum(a) you will error due to a non-conforming entry.
12 Comments
Better answer -- didn't think to test it with datenum. I'll leave the other "just in case" and as it demonstrates a use of cellfun when searching cellstring arrays...and, of course, OP doesn't need to create the temporary d at all, simply wrap dateteime around datenum.
d = datetime(datenum(a),'Format','yyyy-MM-dd HH:mm:ss.S','convertFrom','datenum');
dpb
on 11 Feb 2016
Hmmm...I don't suppose datetime is clever enough w/ the specific format string to "do the right thing"? I don't have a recent-enough release to be able to test.
Brendan Hamm
on 12 Feb 2016
I believe this was an intentional construction that the Format has to be exact between all entries as this is the "standard" case. As such datetime called without a InputFormat will use the first instance to infer a format. However, if one is going to remove any ambiguity and specify their InputFormat, then performing any inference on the format is again leaving the function open to ambiguity, so I can see why one would expect it to operate as it does. I think really this just boils down to a special case.
In the datenum solution I provided, under the hood datevec is being called and the second portion of a datevec is allowed to have fractional components which is really why this method does allow a solution.
I will place this as an enhancement request.
dpb
on 12 Feb 2016
In datevec the logic fails to find a matching format pattern from the default list and so it reverts to parsing the individual tokens from the string without any format string at all, doing the conversion to time by counting the field placement and making an assumption regarding order of the date fields as to month/day. Since HH:MM:SS.SSS is a fixed order there's no ambiguity there but it doesn't again build any format string, it simply converts the field numeric characters and combines based on position. Thus the fraction gets built by computation from the full and partial seconds fields rather than converted by a format string.
This is why it's so slow when one doesn't specify a format string. I'd never looked at the guts before, had always presumed it did manage to find a pattern...but no.
Hannah
on 21 Sep 2017
Hi,
Has this issue been updated in more recent versions?
Thanks
dpb
on 21 Sep 2017
Dunno...lastest release I have is R2014b. As of then datevec was an m-file so could look at implementation; try
which datevec
on your release and see what it currently is...if still an m-file I'd guess it's likely it hasn't changed but you can go read the code to see specifically.
Peter Perkins
on 21 Sep 2017
I would stick with what I suggested on 18 Feb 2016 below. More explicitly:
>> a =
2×1 cell array
{'2016-02-09 10:28:00' }
{'2016-02-09 10:28:01.5'}
>> t = datetime(a,'InputFormat','yyyy-MM-dd HH:mm:ss.S')
t =
2×1 datetime array
NaT
09-Feb-2016 10:28:01
>> t(isnat(t)) = datetime(a(isnat(t)),'InputFormat','yyyy-MM-dd HH:mm:ss')
t =
2×1 datetime array
09-Feb-2016 10:28:00
09-Feb-2016 10:28:01
>> t.Format = 'yyyy-MM-dd HH:mm:ss.S'
t =
2×1 datetime array
2016-02-09 10:28:00.0
2016-02-09 10:28:01.5
Peter, I agree that's a good workaround for OP's problem if can't fix the input at the source.
What would you think of an enhancement to allow an array of conversion formats matching the input if known a priori?
Possibly a small-enough use case to not be worth the extra overhead but perhaps it could be implemented such that only a test for the optional inputs be required that wouldn't take much.
It could be something like a cellstr array of specific format strings and a corresponding vector of indices(*) to apply each or a explicit array of format for each record.
(*) or, alternatively, a repetition factor vectir whose sum matches total number of records.
Peter Perkins
on 24 Sep 2017
We're definitely looking at things like that for improving text-to-datetime import. Thanks for the suggestion.
For the time being one could build the answer using arrayfun or cellfun or the like...hadn't thought of that when OPs original question arose; I was fixated on the cleanup of the input to meet the requirements of existing function.
Reverting to previous sample inputs--
>> a={'2016-02-09 10:28:00';'2016-02-09 10:28:01.5'};
>> fmt={'yyyy-MM-dd HH:mm:ss';'yyyy-MM-dd HH:mm:ss.S'};
>> cellfun(@(t,f) datetime(t,'inputformat',f),a,fmt)
Error using cellfun
datetime output type is not supported. Set 'UniformOutput' to false.
Huh. There's another enhancement request... :)
>> dnm=cellfun(@(t,f) datetime(t,'inputformat',f),a,fmt,'uniformoutput',0);
>> dnm=[dnm{:}].';
>> dnm.Format=[dnm.Format '.S']
dnm =
09-Feb-2016 10:28:00.0
09-Feb-2016 10:28:01.5
>>
using the explicit format string per element. The index array should also work with slight modification of anonymous function...
ix=num2cell(~cellfun(@isempty,strfind(a,'.'))+1); % logical to indexing cell array
dnm=cellfun(@(t,i) datetime(t,'inputformat',fmt{i}),a,ix,'uniformoutput',0); % must dereference format cellstr array
does also work using the location of given format in array but only the unique format strings.
Peter Perkins
on 26 Sep 2017
>> version
ans =
'9.3.0.713579 (R2017b)'
>> a = {'2016-02-09 10:28:00.5';'2016-02-09 10:28:01'}
a =
2×1 cell array
{'2016-02-09 10:28:00.5'}
{'2016-02-09 10:28:01' }
>> fmt = {'yyyy-MM-dd HH:mm:ss.S';'yyyy-MM-dd HH:mm:ss'}
fmt =
2×1 cell array
{'yyyy-MM-dd HH:mm:ss.S'}
{'yyyy-MM-dd HH:mm:ss' }
>> cellfun(@(t,f) datetime(t,'format',f),a,fmt)
ans =
2×1 datetime array
2016-02-09 10:28:00.5
2016-02-09 10:28:01.0
Just to be clear, this does not allow you to provide multiple formats to the datetime constructor, but it does allow dpb's solution using cellfun to work.
Nice enhancement...I've noted in the past that datetime class is clearly "a work in progress" and while basic functionality and interaction is pretty good start there were/are still missing pieces. This just took care of one; not sure I'd actually run across it before, but it's a continuing step to fully integrate the facility.
Kudos! :)
PS: I note you carefully arranged to put the format with the .S format string first so the output format included it and didn't have to reset the display format... :)
Winston Black
on 12 Feb 2016
0 votes
Categories
Find more on Dates and Time in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!