Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

To resolve issues starting MATLAB on Mac OS X 10.10 (Yosemite) visit: http://www.mathworks.com/matlabcentral/answers/159016

Datenum - different date formats

Asked by Ms. Mat on 23 Dec 2012

Background

I have an excel sheet with different date formats (in excel some are in General format, some in date format totally random)

example: 10/05/2005 , 10/5/2005, 10/05/05, 05/12/2005 and so on.

I read the XL file into MATLAB as follows:

[dta_num , dta_txt , dta_raw] = xlsread(...)

I need to work with the dates in the excel sheet subsequently.

Actual Problem

I tried out the following commands

  • datenum('10/5/2005')
  • datenum('10/05/2005')
  • datenum('10/05/05')
  • datenum(dta_txt(2,2))

All of them return the same answer. (732590) But I realized that 732590 is read as 05-Oct-2005 when I ran

datestr(732590)

However the dates in my data sheet needs to be read as dd/mm i.e 10-May-2005.

If I choose to specify the format,

  1. * datenum('10/5/2005','dd/mm/yyyy') ans = 732442
  2. * datenum('10/05/2005','dd/mm/yyyy') ans = 732442
  3. * datenum('10/05/05','dd/mm/yyyy') ans = 1957

Originally I thought I will convert the dates using datenum and the after my manipulations using functions such as month(datenum(dta_txt(2,2),'dd/mm/yyyy')) I will use datestr and I will have all the dates in the same format. But now I am back to square 1 since datestr(1957) gives me a 2 digit year.

Also I want to make sure that date formats are not mixed up i.e dd/mm/yyyy and mm/dd/yyyy are mixed up. For this I plan to go over the date and see if the previous date belongs to the same month or one month earlier and no month is greater than 12 assuming a dd/mm format. If it is, then it is to be decoded as mm/dd/yyyy and appropriately changed to dd/mm/yyyy.

Is the only solution is to go over the length of the dta_txt contents using cellfun('length',dta_txt(:,2)) and change it to 4 year format (I guess there are very few in 2 year digit YY format)

To summarize my requirements are;

1. date format which takes less resource and easy to manipulate as I might want to extract and make comparisons of the the month and year later. 2. Have a consistent date format preferably dd/mm/yyyy.

2 Comments

Walter Roberson on 23 Dec 2012

I closed the other (earlier) post as there had been less commentary on it.

Ms. Mat

Products

No products are associated with this question.

2 Answers

Answer by Laura Proctor on 23 Dec 2012
Edited by Laura Proctor on 23 Dec 2012
Accepted answer

Would something like this help?

% Create a cell array containing dates in various formats
y = { '10/5/05' ; '18/2/2004' ; '3/3/2003' }
% Split up the elements in the cell array
[~,~,~,~,~,~,ss] = regexp(y,'/')
% Determine which dates have a two element year
ivals = cellfun(@(x) numel(x{3})==2,ss)
% Preallocate a vector to contain the dates
dates = zeros(size(y))
% Convert those dates with a four element year using datenum
dates(~ivals) = datenum(y(~ivals),'dd/mm/yyyy')
% Convert the dates with a two element year
dates(ivals) = datenum(y(ivals),'dd/mm/yy')

You could also use datevec to pull out the elements separately. Here I have assumed that any two element years that occur are later than 2000. If you also have dates that occur in the 1900s, then a little more logic is required.

% Create three vectors: year, month, and day
[year,month,day,~,~,~] = datevec(y,'dd/mm/yyyy')
% Logical array containing those year values that are less than 2000
lt2000 = year < 2000
% Add 2000 to any year that are less than 2000
year(lt2000) = 2000+year(lt2000)

3 Comments

José-Luis on 23 Dec 2012

An alternative solution would be to first manipulate your excel file so all the dates are stored as numbers. I know next to nothing about Excel so I can't really advice you on that. It should be straightforward, methinks.

Once you have the date as a serial number, then you can transform to a Matlab datenumber after importing and avoid all the potentially costly string comparisons.

bym on 23 Dec 2012

This works for 2 digit years

datenum('10/05/05','dd/mm/yyyy',2000)
ans =
        732442
Laura Proctor on 23 Dec 2012

You're right, proecsm - it does work if you just have one date. However, if you have a cell array of dates, it will not return the same results.

Walter posted a great solution in the duplicate post:

dates = datenum(y,'dd/mm/yyyy',2000)
Laura Proctor
Answer by per isakson on 23 Dec 2012
Edited by per isakson on 23 Dec 2012

There is no way for me (in Sweden) to know how to interpret

    '01/02/03'

or

    '01/02/2003'

IMO: when in doubt avoid to use the default values. It is better to specify the format an extra time than not to do it when needed.

IMO: the most precious resource is the time of the programmer.

In this case a vectorized code is faster than a loop. Experiment:

>> mysteries_dates_to_sdn
Elapsed time is 1.540982 seconds.
Elapsed time is 0.056104 seconds.
>> mysteries_dates_to_sdn
Elapsed time is 1.508512 seconds.
Elapsed time is 0.055982 seconds.
>> 

where

    %% mysteries_dates_to_sdn
    N = 1e3;
    castr = {
        '10/05/2005'
        '10/5/2005'
        '10/05/05'
        '05/12/2005'
        };
    %%
    castr = repmat( castr, [N,1] );
    %%
    tic
    sdn_loop = mysteries_dates_to_sdn_loop( castr );
    toc
    %%
    tic
    sdn_vec = mysteries_dates_to_sdn_vectorized( castr );
    toc
    %%
    assert( all( sdn_vec == sdn_loop )              ...
        , 'mysteries_dates_to_sdn:failure'          ...
        , 'Failure: sdn_loop differs from sdn_vec'  )

and

    function    sdn = mysteries_dates_to_sdn_loop( castr )
        sdn = nan( size( castr ) );
        for ii = 1 : numel( castr )
            str = castr{ii};
            len = length( str );
            if len >= 9
                sdn(ii) = datenum( str, 'dd/mm/yyyy' );
            else
                sdn(ii) = datenum( str, 'dd/mm/yy' );
            end
        end
    end

and

    function    sdn = mysteries_dates_to_sdn_vectorized( castr )
        sdn = nan( size( castr ) );
        len = cellfun( @length, castr );
        islong  = ( len >= 9 );
        isshort = not( islong );
        sdn(islong)  = datenum( castr(islong) , 'dd/mm/yyyy' ); 
        sdn(isshort) = datenum( castr(isshort), 'dd/mm/yy' ); 
     end

.

BTW: Why not use datevec rather than datenum to avoid calling the function, month

.

More:

This function (proposed above) is approx. 7% faster than mysteries_dates_to_sdn_vectorized

    function    sdn = mysteries_dates_to_sdn_pivot( castr )
        sdn = datenum( castr, 'dd/mm/yyyy', 2000 ); 
    end

0 Comments

per isakson

Contact us