File Exchange

DateStr2Num

version 1.4.0.0 (14.3 KB) by Jan
Convert date string to date number - C-Mex: much faster than DATENUM

10 Downloads

Updated 14 Jun 2018

DATESTR2NUM - Fast conversion of DATESTR to DATENUM
The builtin DATENUM command is very powerful, but if the input is known to be valid and formatted exactly, a specific MEX can be much faster:
For single strings, DateStr2Num is about 120 times faster than DATENUM; for a {1 x 10000} cell string, the speed-up factor is 300 to 600(!) (Matlab 2011b/64, MSVC 2008).
D = DateStr2Num(S, F)
INPUT:
S: String or cell string in DATESTR(F) format.
In contrast to DATENUM, the validity of the input string is *not* checked.
F: Integer number defining the input format. Accepted:
     0:  'dd-mmm-yyyy HH:MM:SS'       01-Mar-2000 15:45:17
     1:  'dd-mmm-yyyy'                01-Mar-2000
    29:  'yyyy-mm-dd'                 2000-03-01
    30:  'yyyymmddTHHMMSS'            20000301T154517
    31:  'yyyy-mm-dd HH:MM:SS'        2000-03-01 15:45:17
   230:  'mm/dd/yyyyHH:MM:SS'         12/24/201515:45:17
   231:  'mm/dd/yyyy HH:MM:SS'        12/24/2015 15:45:17
   240:  'dd/mm/yyyyHH:MM:SS'         24/12/201515:45:17
   241:  'dd/mm/yyyy HH:MM:SS'        24/12/2015 15:45:17
  1000:  'dd-mmm-yyyy HH:MM:SS.FFF'   01-Mar-2000 15:45:17.123
  1030:  'yyyymmddTHHMMSS.FFF'        20000301T154517.123
OUTPUT:
D: Serial date number.
EXAMPLE:
C = {'2010-06-29 21:59:13', '2010-06-29 21:59:13'};
D = DateStr2Num(C, 31)
>> [734318.916122685, 734318.916122685]
Equivalent Matlab command:
D = datenum(C, 'yyyy-mm-dd HH:MM:SS')
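
To illustrate why a fixed format without validity checks can be parsed so cheaply, here is a minimal C sketch for format 31. It is not the code shipped with this submission: the names `two_digits`, `day_number` and `parse_format31` are hypothetical, and the sketch assumes a year of 1 or later and a correctly formatted input. It reads every field from a fixed position and does the date arithmetic with integers, with a single floating-point division at the end.

```c
/* Days elapsed before the first day of each month in a non-leap year. */
const int kDaysBeforeMonth[12] =
    {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334};

/* Read two decimal digits at s[0..1]. The characters are NOT validated. */
int two_digits(const char *s)
{
    return (s[0] - '0') * 10 + (s[1] - '0');
}

/* Whole days since 0-Jan-0000 (serial date number), for year >= 1.
   The integer divisions count the leap days of all earlier years. */
long day_number(int year, int month, int day)
{
    long n = 365L * year + (year + 3) / 4 - (year + 99) / 100
             + (year + 399) / 400;
    int is_leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;

    n += kDaysBeforeMonth[month - 1] + day;
    if (is_leap && month > 2) {
        n += 1;  /* 29-Feb of the current year has already passed */
    }
    return n;
}

/* Hypothetical helper: parse 'yyyy-mm-dd HH:MM:SS' by fixed positions.
   An invalid or misaligned string yields garbage instead of an error. */
double parse_format31(const char *s)
{
    int year = two_digits(s) * 100 + two_digits(s + 2);
    int month = two_digits(s + 5);
    int day = two_digits(s + 8);
    long seconds = two_digits(s + 11) * 3600L + two_digits(s + 14) * 60L
                   + two_digits(s + 17);

    return (double)day_number(year, month, day) + (double)seconds / 86400.0;
}
```

With the string from the example above, `parse_format31("2010-06-29 21:59:13")` gives 734318 whole days plus 79153/86400 of a day, which agrees with the documented result 734318.916122685.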
The C-file must be compiled before use. This is done automatically on the first call of this function.
Pre-compiled Mex files can be downloaded from: http://www.n-simon.de/mex

Tested: Matlab 6.5, 7.7, 7.8, 7.13, 32/64bit, WinXP/7
Compiler: LCC 2.4/3.8, BCC 5.5, Open Watcom 1.8, MSVC 2008
Compatibility with MacOS, Linux, and 64 bit is assumed, but not tested.

See also: DateConvert (Jan Simon)
http://www.mathworks.com/matlabcentral/fileexchange/25594

Cite As

Jan (2019). DateStr2Num (https://www.mathworks.com/matlabcentral/fileexchange/28093-datestr2num), MATLAB Central File Exchange. Retrieved .

Comments and Ratings (28)

This should be included in the Statistics and ML toolbox

A neat concept neatly implemented. On the occasions when I have needed to process some large collections of data this submission was most useful: thank you Jan Simon! Does exactly what it says on the box :)

Jan

@Aaron Schurger: Now month names are accepted in uppercase also.

One minor glitch: I get an error when the name of the month is in all caps, as in
'03-JAN-2016 09:15:02'
I converted to lower case first and then it worked like a charm.
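
A case-insensitive lookup of the month name could be sketched in C as follows. This is only an illustration of the idea, not the code from version 1.4.0.0; the function name `month_from_name` and the return value 0 for unknown names are assumptions made for this example.

```c
#include <ctype.h>

/* Hypothetical helper: map a three-letter English month abbreviation to
   1..12, ignoring upper and lower case. Returns 0 for an unknown name
   or for a string shorter than three characters. */
int month_from_name(const char *s)
{
    static const char names[] = "janfebmaraprmayjunjulaugsepoctnovdec";
    char key[3];
    int i, m;

    for (i = 0; i < 3; i++) {
        if (s[i] == '\0') {
            return 0;  /* too short to be a month abbreviation */
        }
        key[i] = (char)tolower((unsigned char)s[i]);
    }

    for (m = 0; m < 12; m++) {
        if (key[0] == names[3 * m] && key[1] == names[3 * m + 1] &&
            key[2] == names[3 * m + 2]) {
            return m + 1;
        }
    }
    return 0;
}
```

For the string in the comment above, the month field starts at offset 3, so `month_from_name("03-JAN-2016 09:15:02" + 3)` returns 1.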

Jan

@Johan Hagman: Of course more formats can be implemented. A modification of the code should be very simple for your case. I did not implement years with 2 digits, because this would require assumptions about the missing digits, and smart assumptions are definitely not the job of this tool. If you contact me by mail (address found in the code) and explain *exactly* how you want the YY to be completed to YYYY, I can provide a solution, as long as no "pivot years" or requests of the current date are required.

Is there any possibility of adding more date formats? The datenum function is way too slow when handling >150 million dates, and our dates are sadly specified in 'dd-mmm-yy HH:MM:SS' or 'dd-mmm-yy HH:MM:SS.FFF'. In this function two digits for the year are not enough: as specified by the formats, all years must be given with 4 digits.
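
One rule that needs neither a pivot year nor the current date is a fixed century supplied by the caller. The C sketch below shows that rule only as an assumption about what an exact specification could look like; it is not part of the submission, and `expand_two_digit_year` and `year_from_ddmmmyy` are hypothetical names.

```c
/* Hypothetical rule: complete a two-digit year with a fixed century,
   e.g. century = 2000 turns "16" into 2016. No pivot year and no lookup
   of the current date are involved. The digits are NOT validated. */
int expand_two_digit_year(const char *yy, int century)
{
    return century + (yy[0] - '0') * 10 + (yy[1] - '0');
}

/* Year of a 'dd-mmm-yy HH:MM:SS' string: the two year digits sit at the
   fixed offsets 7 and 8. */
int year_from_ddmmmyy(const char *s, int century)
{
    return expand_two_digit_year(s + 7, century);
}
```

For example, `year_from_ddmmmyy("03-Jan-16 09:15:02", 2000)` yields 2016, while the same string with a century of 1900 yields 1916. Which century is correct is exactly the decision the caller has to make.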

Aman Sethi

Alain

Jan, this is truly great.

Have you thought about datestr as well? It seems even slower, and a DateNum2Str would complement this nicely...

Ingrid

Reading in large CSV files with dates in the first column turned out to be extremely slow due to the datenum function, and this file saved the day.

Felipe

Jan

@joh: What exactly is an "array of dates"? Cell strings are handled internally already.

joh

hi, would you use a for loop for an array of dates?

Perfect! I am using 15-minute interval stream records; my data sets are around 0.5 million lines (elapsed time is 0.019273 seconds to convert dates to numbers using DateStr2Num).

Jan

@James: I do not understand the question. Of course I've created the C-file, before it could be compiled by the mex command.
I suggest posting the error you get from the compiler and explaining the required details.

James

Excellent work Jan! How did you create the c file before using the mex command? I keep getting errors from matlab compiler.

Thanks
-Jimmy

Hao Shen

Excellent stuff !!!!

Exactly what I was looking for! This is really good stuff

Very nice submission. It is enormously faster than datenum. Another great file from Jan.

Jan

To convert a serial date number to a '2011-08-24 23:38:44' date string:
sprintf('%.4d-%.2d-%.2d %.2d:%.2d:%.2d', datevecmx(now, 1));
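
For readers who want the reverse direction in C, the following sketch formats a serial date number as 'yyyy-mm-dd HH:MM:SS'. It is not a translation of `datevecmx`: the function name `datenum_to_string` is hypothetical, rounding to the nearest whole second is a choice made for this example, and the day-to-date conversion uses Howard Hinnant's publicly documented days-to-civil algorithm. It assumes a date number of at least 1 and a year of at most 9999.

```c
#include <stdio.h>

/* Hypothetical helper: write a serial date number (days since 0-Jan-0000)
   as 'yyyy-mm-dd HH:MM:SS' into buf, which should hold at least 20 bytes.
   Assumes dn >= 1; the time is rounded to the nearest whole second. */
void datenum_to_string(double dn, char *buf, size_t bufsize)
{
    long long total = (long long)(dn * 86400.0 + 0.5); /* rounded seconds */
    long days = (long)(total / 86400);
    long secs = (long)(total % 86400);

    /* Shift so that day 0 is 0000-03-01, which is serial date number 61.
       Starting the year in March puts the leap day at the end. */
    long z = days - 61;
    long era = (z >= 0 ? z : z - 146096) / 146097;  /* 400-year cycles */
    long doe = z - era * 146097;                    /* day of era */
    long yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    long doy = doe - (365 * yoe + yoe / 4 - yoe / 100); /* from 1-Mar */
    long mp = (5 * doy + 2) / 153;                  /* 0 = March */
    int day = (int)(doy - (153 * mp + 2) / 5 + 1);
    int month = (int)(mp < 10 ? mp + 3 : mp - 9);
    int year = (int)(yoe + era * 400 + (month <= 2 ? 1 : 0));

    snprintf(buf, bufsize, "%04d-%02d-%02d %02d:%02d:%02d",
             year, month, day,
             (int)(secs / 3600), (int)((secs / 60) % 60), (int)(secs % 60));
}
```

Applied to the value from the example in the description, `datenum_to_string(734318.916122685, buf, sizeof buf)` writes `2010-06-29 21:59:13` into `buf`.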

Saad

ignore my previous comment...the string had some trailing whitespace which was leading the c program to throw an exception.
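
Because the input is not validated, a cheap length check before the conversion can catch this kind of problem. The C sketch below is an assumption about how a caller might guard the input, not something the submission does; `trimmed_length` and `has_format31_length` are hypothetical names, and 19 is the length of a 'yyyy-mm-dd HH:MM:SS' string.

```c
#include <ctype.h>
#include <string.h>

/* Length of s when trailing whitespace (spaces, tabs, newlines) is
   ignored. The string itself is not modified. */
size_t trimmed_length(const char *s)
{
    size_t n = strlen(s);

    while (n > 0 && isspace((unsigned char)s[n - 1])) {
        n--;
    }
    return n;
}

/* Hypothetical pre-check: after ignoring trailing whitespace, does s have
   exactly the 19 characters of 'yyyy-mm-dd HH:MM:SS'? This tests the
   length only, not the content. */
int has_format31_length(const char *s)
{
    return trimmed_length(s) == 19;
}
```

A string such as `"2010-06-29 21:59:13   "` passes this check although its raw length is 22, so the caller knows that only the first 19 characters should be handed to the parser.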

Saad

Hi, thanks for writing this, it works great. But it seems like the C compiled version doesn't accept a char string? I have a char string of 1x25 which works with the normal .m function, but I get a warning that the format is not acceptable within the compiled version. I have to use the cellstr() function to convert before passing it to the compiled version. In the end, the run times of the compiled and .m functions are the same. Any way you can adapt this?

Todd

My bad. Like Nate, I figured it out - what a difference! The command:
mex -O DateStr2Num.c
Generated a DateStr2Num.mexw64 file and wham! My code was at 90s using datenum. Now that section takes 0.02s. Thanks!

Nate Jensen

Sorry, I'm retarded, I figured it out.

Nate Jensen

Good function, I use it all the time. I am retarded at C though. Could you tell me how to run the C function from Matlab? Thanks.

Jan

@Reyna: The new '300' format contains milliseconds also.

Reyna

This is great. Is it possible to add millisec? Particularly to extend format 30 to yyyymmddTHHMMSS.FFF?
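
The time part of a 'yyyymmddTHHMMSS.FFF' string can be handled with integer arithmetic alone, leaving one floating-point division for the very end. The C sketch below shows the idea under the assumption of a correctly formatted input; it is not the submission's code, and `millis_of_day_1030` and `day_fraction_1030` are hypothetical names.

```c
/* Hypothetical helper: milliseconds since midnight for the time part of
   'yyyymmddTHHMMSS.FFF'. The fields sit at fixed offsets: HH at 9,
   MM at 11, SS at 13, FFF at 16. The characters are NOT validated. */
long millis_of_day_1030(const char *s)
{
    long hh = (s[9] - '0') * 10 + (s[10] - '0');
    long mm = (s[11] - '0') * 10 + (s[12] - '0');
    long ss = (s[13] - '0') * 10 + (s[14] - '0');
    long fff = (s[16] - '0') * 100 + (s[17] - '0') * 10 + (s[18] - '0');

    return ((hh * 60 + mm) * 60 + ss) * 1000 + fff;
}

/* Fraction of a day for the same string: the only floating-point
   operation is this final division by the milliseconds per day. */
double day_fraction_1030(const char *s)
{
    return (double)millis_of_day_1030(s) / 86400000.0;
}
```

For `"20000301T154517.123"` the first function returns 56717123, that is 15:45:17.123 expressed in milliseconds since midnight.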

Updates

1.4.0.0

Month names considered in upper and lower case.

1.3.0.0

4 new formats, automatic compilation

1.2.0.0

New format "1000": dd-mmm-yyyy HH:MM:SS.FFF
Old format "300" called "1030" now.

1.1.0.0

New integer arithmetic for >50% more speed. Support for milliseconds in the yyyymmddTHHMMSS.FFF format.

MATLAB Release Compatibility
Created with R2016b
Compatible with any release
Platform Compatibility
Windows, macOS, Linux