New column

Pap on 22 Mar 2011
Hello,
I am working with the sample below, taken from a txt file (the original is huge):
Stock Date Time Price Volume Stock Category
ETE 04/01/2010 10145959 18.31 500 Big Cap
ETE 04/01/2010 10150000 18.01 70 Big Cap
ETE 04/01/2010 10170000 18.54 430 Big Cap
ABC 04/01/2010 10190000 18.34 200 Big Cap
YYY 04/01/2010 10200000 18.34 100 Big Cap
ETE 04/01/2010 10250000 18.31 40 Big Cap
ETE 04/01/2010 10295959 18.74 215 Big Cap
ETE 04/01/2010 10300000 18.74 500 Big Cap
ETE 04/01/2010 10320000 18.34 500 Big Cap
I need to create a new variable (a seventh column, say 'TRADE'). Its first value will be arbitrarily assigned to 'BUY' (is there any code to do that?). After that, its value should be 'BUY' if the row's Price (column 4) is higher than the previous row's Price, and 'SELL' if it is lower. If the two prices are equal, it should repeat the previous row's value in the new Trade column: 'BUY' if the previous row is 'BUY', otherwise 'SELL' (that is why I arbitrarily define the first value),
so the sample will look like:
Stock Date Time Price Volume Stock Category Trade
ETE 04/01/2010 10145959 18.31 500 Big Cap BUY
ETE 04/01/2010 10150000 18.01 70 Big Cap SELL
ETE 04/01/2010 10170000 18.54 430 Big Cap BUY
ABC 04/01/2010 10190000 18.34 200 Big Cap SELL
YYY 04/01/2010 10200000 18.34 100 Big Cap SELL
ETE 04/01/2010 10250000 18.31 40 Big Cap SELL
ETE 04/01/2010 10295959 18.74 215 Big Cap BUY
ETE 04/01/2010 10300000 18.74 500 Big Cap BUY
ETE 04/01/2010 10320000 18.34 500 Big Cap SELL
Any help?
Thanks in advance,
Panos

Accepted Answer

Matt Tearle on 22 Mar 2011
I think this does what you want:
Price = randi(10,20,1)
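s = [1;sign(diff(Price))];   % +1 price up, -1 price down, 0 unchanged; first row arbitrarily "buy"
nz = find(s);                % rows where the price actually changed
s = s(nz(cumsum(s~=0)));     % carry the last change through runs of equal prices (any length)
buy = s>0;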
[Price,buy]
I'm using a logical array buy that is true for "buy" and false for "sell". You could use a nominal array (if you have Statistics Toolbox) to assign arbitrary labels (i.e. "buy" and "sell"), but a logical array is probably the easiest way to get what you want.
EDIT: if you really want a column of strings (for output purposes), this will do it:
buysell = cellstr(repmat('Sell',size(Price)));
buysell(buy) = {'Buy'}
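For cross-checking a vectorized version, the rule from the question can also be written as a plain loop. This is only a minimal sketch: the prices are copied from the sample in the question, and the variable name trade is an assumption made for illustration. It is slower on millions of rows, but it follows the stated rule line by line, including runs of several equal prices.

```matlab
% Prices copied from the sample in the question
Price = [18.31; 18.01; 18.54; 18.34; 18.34; 18.31; 18.74; 18.74; 18.34];

buy = false(size(Price));
buy(1) = true;                  % first trade arbitrarily set to "buy"
for k = 2:numel(Price)
    if Price(k) > Price(k-1)
        buy(k) = true;          % price rose: buy
    elseif Price(k) < Price(k-1)
        buy(k) = false;         % price fell: sell
    else
        buy(k) = buy(k-1);      % price unchanged: repeat the previous decision
    end
end

% Turn the logical flags into labels ("trade" is a hypothetical name)
trade = repmat({'SELL'}, size(Price));
trade(buy) = {'BUY'};
disp(trade)
```

For the sample prices this reproduces the Trade column shown in the question.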
  3 Comments
Pap on 1 Apr 2011
Hi Matt,
May I ask what the first line of the above code ('randi(10,20,1)') is for?
Does it specify the number of rows of random values to generate?
I am actually trying to apply this to a larger dataset but I get the same output.
- Can I apply the above if I do not know the exact number of rows (because I work with a huge ASCII dataset)?
- How can I put the output (cell array) into the original ASCII file?
Many thanks
Panos
Matt Tearle on 2 Apr 2011
The first line was just to make some example price data. Remove it and use your data.
See my new answer for the whole process of reading and writing.

More Answers (1)

Matt Tearle on 2 Apr 2011
% Read stock data from file
fid = fopen('stocks.txt');
data = textscan(fid,'%s%s%f%f%f%[^\n]','delimiter',' ','headerlines',1);
% Read as text (for later writing)
frewind(fid);
txt = textscan(fid,'%s','delimiter','\n');
fclose(fid);
% Get prices from imported data
Price = data{4};
% Determine which stocks to buy
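s = [1;sign(diff(Price))];   % +1 price up, -1 price down, 0 unchanged; first row arbitrarily "buy"
nz = find(s);                % rows where the price actually changed
s = s(nz(cumsum(s~=0)));     % carry the last change through runs of equal prices (any length)
buy = s>0;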
% Make string of trade decision
buysell = cellstr(repmat(' Sell',size(Price)));
buysell(buy) = {' Buy'};
% Open file for writing
fid = fopen('stocks2.txt','wt');
% Make output string by appending trade decision
outstr = strcat(txt{1},[' Trade';buysell]);
% Write out
fprintf(fid,'%s\n',outstr{:});
fclose(fid);
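The appending step can be tried on a few in-memory lines before running it on a large file. The sketch below makes some assumptions: lines stands in for txt{1}, the two data rows and the buy flags are made up for illustration, and the size check is an addition that is not part of the code above.

```matlab
% Hypothetical stand-in for txt{1}: header line plus two data lines
lines = {'Stock Date Time Price Volume Stock Category'; ...
         'ETE 04/01/2010 10145959 18.31 500 Big Cap'; ...
         'ETE 04/01/2010 10150000 18.01 70 Big Cap'};
buy = [true; false];            % made-up decisions for the two data rows

% Build the label column, with a leading space as the separator
buysell = repmat({' Sell'}, size(buy));
buysell(buy) = {' Buy'};

% strcat works element by element, so the sizes must agree:
% one label per data line, plus one for the header
assert(numel(lines) == numel(buysell) + 1, 'Line count does not match row count');

outstr = strcat(lines, [{' Trade'}; buysell]);
fprintf('%s\n', outstr{:});
```

If the check fails on real data, the text read and the numeric read did not see the same number of rows, which is worth resolving before writing anything out.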
  4 Comments
Pap on 5 Apr 2011
Hi Matt,
I also applied the above code to a big txt file (540 MB, with almost 12,000,000 rows) and it doesn't seem to work. For data=textscan.... I get a 6x1 cell array (headers only) and not the data, unlike with the sample file (830120x1 cell array). Also, when defining column 4 (Price) I do not get values as with the sample file; I get '[]' instead. As a result I also get '[]' for idx=.., etc.
Any hint on what I may have done wrong?
Is there any limit on data size in MATLAB?
Panos
Matt Tearle on 6 Apr 2011
I've added an answer to your new question about this, but as an aside here: data should be a 1-by-6 cell array. Each cell should contain an n-by-1 array of appropriate type (cell or double).
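That expected shape can be checked in a few lines. This is a rough sketch, assuming a small hand-built data in place of the real textscan output; the names isRowCell, nRows and sameLength are made up for illustration.

```matlab
% Hypothetical stand-in for the textscan output on a 3-row file
data = {{'ETE';'ETE';'ABC'}, ...
        {'04/01/2010';'04/01/2010';'04/01/2010'}, ...
        [10145959;10150000;10190000], ...
        [18.31;18.01;18.34], ...
        [500;70;200], ...
        {'Big Cap';'Big Cap';'Big Cap'}};

isRowCell  = isequal(size(data), [1 6]);      % expect a 1-by-6 cell array
nRows      = cellfun(@(c) size(c,1), data);   % number of rows held in each cell
sameLength = all(nRows == nRows(1));          % every column should have n rows

fprintf('1-by-6: %d, rows per column: %s, consistent: %d\n', ...
        isRowCell, mat2str(nRows), sameLength);
```

Empty cells, or columns of different lengths, suggest that the read stopped early, for example at a line that does not match the format string.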
