How to reshape panel data efficiently

Question

0 votes

I have a database where I store historical prices of ~1000 stocks over a 10+ year period in panel data format (n x 4): [Ticker] [Date] [Price] [Return]. I import the data into MATLAB and then try to construct two historical matrices (one for prices and another for returns) in the format columns -> tickers, rows -> dates, values -> Price/Return. To do that I use the following code:

historical_returns; %panel data cell array imported from the database
historical_dates; %array that includes all historical dates
tickers; %array that includes all the tickers
Matrix_Prices = zeros(length(historical_dates),length(tickers));
Matrix_Returns = zeros(length(historical_dates),length(tickers));
for i = 1:size(historical_returns,1)
    temp_ticker = historical_returns{i,1};
    temp_date = historical_returns{i,2};
    temp_price = historical_returns{i,3};
    temp_return = historical_returns{i,4};
    row = find(strcmpi(historical_dates,temp_date));
    column = find(strcmpi(tickers,temp_ticker));
    Matrix_Prices(row,column) = temp_price;
    Matrix_Returns(row,column) = temp_return;
end

The code above takes ~200 seconds to run when historical_returns has about 1,000,000 rows (1,000,000 x 4), and the run time grows as the number of tickers and dates increases. I am trying to optimize the code, if possible, and would like to know whether there is a faster way to construct Matrix_Prices and Matrix_Returns. I have thought of storing the data in a different format, but given the limit on the number of columns in Access and SQL databases, I cannot create a new column for each ticker.

Answer 1

Guillaume on 6 Jan 2015

Edited: Guillaume on 6 Jan 2015

1 vote

You can replace the loop with the following:

Matrix_Prices = zeros(numel(historical_dates), numel(tickers));
Matrix_Returns = zeros(numel(historical_dates), numel(tickers));
%column 1 of historical_returns holds the ticker and column 2 the date, as described in the question
[~, rows] = ismember(lower(historical_returns(:, 2)), lower(historical_dates)); %use lower case to make the comparison case insensitive
[~, cols] = ismember(lower(historical_returns(:, 1)), lower(tickers));
indices = sub2ind(size(Matrix_Prices), rows, cols); %errors if a date or ticker was not found (ismember returns 0)
Matrix_Prices(indices) = cell2mat(historical_returns(:, 3));
Matrix_Returns(indices) = cell2mat(historical_returns(:, 4));
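
To see what ismember and sub2ind are doing here, a small sketch on made-up data may help. It assumes the column order given in the question (ticker in column 1, date in column 2). The prices, the 'XYZ' ticker and the valid mask are illustrative additions, not part of the original answer; the mask just skips rows whose date or ticker is not in the lookup lists.

```matlab
% Made-up panel data: {Ticker, Date, Price, Return}
historical_returns = {
    'AAPL', '2015-01-02', 100, 0.01
    'MSFT', '2015-01-02',  50, 0.02
    'aapl', '2015-01-05', 101, 0.03   % different case, still matched
    'XYZ',  '2015-01-05',  10, 0.04   % ticker missing from the list below
    };
historical_dates = {'2015-01-02'; '2015-01-05'};
tickers = {'AAPL', 'MSFT'};

% Second output of ismember is the position in the lookup list (0 if absent)
[~, rows] = ismember(lower(historical_returns(:, 2)), lower(historical_dates));
[~, cols] = ismember(lower(historical_returns(:, 1)), lower(tickers));

% Keep only rows where both the date and the ticker were found
valid = rows > 0 & cols > 0;

Matrix_Prices = zeros(numel(historical_dates), numel(tickers));
indices = sub2ind(size(Matrix_Prices), rows(valid), cols(valid));
Matrix_Prices(indices) = cell2mat(historical_returns(valid, 3));
```

Without the mask, the 'XYZ' row would give cols = 0 and sub2ind would raise an out-of-range error, so whether to skip such rows or fail loudly is a choice to make for your own data.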

9 Comments

sm on 6 Jan 2015

Edited: sm on 7 Jan 2015

There is either a misunderstanding or I am missing something. To answer the question you asked before ("How do you decide which row of historical_returns go into column 1 or column 2?"): the historical_returns array does not include duplicates, i.e. I store only one 'AAPL' return per day in the database. However, 'AAPL' might have two entries in the position array for the reason explained above. Thus, when I construct Matrix_Returns, tickers might include duplicates (the tickers array includes only ticker names and not the tags, i.e. tickers = {'AAPL' 'MSFT' 'AAPL' 'IBM'}). So if, for example, 'AAPL' appears twice in the tickers array (columns 1 and 3), then columns 1 and 3 of Matrix_Returns should be identical, since both reflect the 'AAPL' return. That's what the original code does.

To give you an idea of the end goal: I have a position matrix (similar to Matrix_Returns) where I keep track of the position for each ticker (see Position matrix above). I multiply the Position matrix with Matrix_Returns to estimate the portfolio's performance. Keeping non-unique entries allows me to analyze particular trades (i.e. partial VaR etc.) without any extra formatting.

To clarify things further, assume that historical dates are unique but tickers are not. In my original code, this implies that row will be a 1x1 matrix, while column can be a 1xm matrix. The issue with your suggested code (which is noticeably faster) is that it does not allow duplicate values in the tickers array.

Guillaume on 7 Jan 2015

Ok, I don't really see the point of duplicating columns in the output matrices, but if that's what you want:

[u_tickers, ~, tickerpos] = unique(lower(tickers)); %lower case, to match the case-insensitive lookup below
%same code as before, but using u_tickers:
Matrix_Prices = zeros(numel(historical_dates), numel(u_tickers));
Matrix_Returns = zeros(numel(historical_dates), numel(u_tickers));
[~, rows] = ismember(lower(historical_returns(:, 2)), lower(historical_dates)); %column 2 holds the date
[~, cols] = ismember(lower(historical_returns(:, 1)), u_tickers); %column 1 holds the ticker
indices = sub2ind(size(Matrix_Prices), rows, cols);
Matrix_Prices(indices) = cell2mat(historical_returns(:, 3));
Matrix_Returns(indices) = cell2mat(historical_returns(:, 4));
%now replicate columns according to tickerpos:
Matrix_Prices = Matrix_Prices(:, tickerpos);
Matrix_Returns = Matrix_Returns(:, tickerpos);
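
The column replication relies on the third output of unique. A minimal sketch of that step alone, using the tickers example from the comment above; Matrix_Unique and its values are made up for illustration:

```matlab
tickers = {'AAPL', 'MSFT', 'AAPL', 'IBM'};
[u_tickers, ~, tickerpos] = unique(lower(tickers));
% u_tickers is sorted: {'aapl', 'ibm', 'msft'}
% tickerpos gives, for each entry of tickers, its position in u_tickers

% Made-up returns, one column per unique ticker (2 dates x 3 tickers)
Matrix_Unique = [0.01 0.02 0.03;
                 0.04 0.05 0.06];

% Indexing with tickerpos reorders and repeats columns to line up with tickers
Matrix_Returns = Matrix_Unique(:, tickerpos);
```

Columns 1 and 3 of Matrix_Returns both come from the 'aapl' column of Matrix_Unique, which is the behaviour asked for in the comment.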

sm on 7 Jan 2015

Thank you. The code above works great.

Answer 2

Peter Perkins on 7 Jan 2015

0 votes

This is a one-liner using the unstack function on a table. It would be something like

wideData = unstack(tallData,{'Price' 'Return'},'Ticker')

where tallData is a table with four variables, Ticker, Date, Price, and Return. Hope this helps.
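
As a rough sketch of how the cell array from the question could be turned into such a table, assuming the same {Ticker, Date, Price, Return} column order and using made-up values:

```matlab
% Made-up panel data: {Ticker, Date, Price, Return}
historical_returns = {
    'AAPL', '2015-01-02', 100, 0.01
    'MSFT', '2015-01-02',  50, 0.02
    'AAPL', '2015-01-05', 101, 0.03
    'MSFT', '2015-01-05',  51, 0.04
    };

% Convert the cell array to a table with named variables
tallData = cell2table(historical_returns, ...
    'VariableNames', {'Ticker', 'Date', 'Price', 'Return'});

% One row per date, one Price and one Return variable per ticker
wideData = unstack(tallData, {'Price', 'Return'}, 'Ticker');

% The generated variable names can be inspected here
disp(wideData.Properties.VariableNames);
```

The result is a table rather than a numeric matrix, with one row per date. As far as I know, date/ticker combinations that do not occur in tallData are filled with NaN rather than 0, so check that against your MATLAB version if the zeros matter downstream.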
