Application of codes on big ASCII file.

Pap on 4 Apr 2011
Hi all,
I am working with the following code:
% Read stock data from file
fid = fopen('stocks.txt');
data = textscan(fid,'%s%s%f%f%f%[^\n]','delimiter',' ','headerlines',1);
% Read as text (for later writing)
frewind(fid);
txt = textscan(fid,'%s','delimiter','\n');
fclose(fid);
% Get prices from imported data
Price = data{4};
% Determine which stocks to buy
buy = [true;diff(Price)>=0];
idx = find(~diff(Price));
buy(idx+1) = buy(idx);
% Make string of trade decision
buysell = cellstr(repmat(' Sell',size(Price)));
buysell(buy) = {' Buy'};
% Open file for writing
fid = fopen('stocks2.txt','wt');
% Make output string by appending trade decision
outstr = strcat(txt{1},[' Trade';buysell]);
% Write out
fprintf(fid,'%s\n',outstr{:});
fclose(fid);
However, while the above works perfectly with a small ASCII file (the example), it does not seem to work with a big ASCII file (almost 540 MB).
When I try to define 'Price' (column 4) from the imported data, I get an empty value '[ ]', while I expected something like '15000000x1 double', since the file has nearly 15000000 rows. (When the above is applied to the sample ASCII file, I do get an 830281x1 double, since it has 830281 rows.) As a result, I also cannot compute 'idx'.
I also cannot define 'data', 'txt', 'buysell' and 'outstr' properly, since the resulting cell arrays are 6x1, 2x1, etc. and do not cover all the rows.
Is there any way to overcome this problem and work with such a file?
Many thanks in advance,
Panos
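An empty or short 'Price' usually means textscan stopped early: when a field does not match the format, it returns whatever it parsed up to that point. A minimal diagnostic sketch follows, assuming a small made-up sample file (the stock name, market text and the malformed third row are hypothetical) in place of the real stocks.txt. It checks whether the end of file was reached and prints the text at which parsing stopped.

```matlab
% Build a tiny sample file (hypothetical data); the 3rd data row is malformed.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'STOCK DATE TIME PRICE VOLUME MARKET\n');
fprintf(fid, 'ABC 04/01/2010 10293955 18.34 500 MAIN MARKET\n');
fprintf(fid, 'ABC 04/01/2010 10293955 18.35 70 MAIN MARKET\n');
fprintf(fid, 'ABC 04/01/2010 bad 18.36 430 MAIN MARKET\n');
fclose(fid);

% Parse with the same format as in the question, default whitespace delimiters.
fid = fopen(fname);
data = textscan(fid, '%s%s%f%f%f%[^\n]', 'HeaderLines', 1);

% If textscan gave up before the end of the file, show where it stopped.
stoppedEarly = ~feof(fid);
if stoppedEarly
    fprintf('Stopped at byte %d, unread text: %s\n', ftell(fid), fgetl(fid));
end
fclose(fid);

% Number of rows actually parsed into each column.
rowsPerField = cellfun(@numel, data);
disp(rowsPerField)
delete(fname);
```

On the real file, the printed text should point at the first row that does not fit the format, which is the place to inspect for an unexpected delimiter or character encoding.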
  1 Comment
Walter Roberson on 4 Apr 2011
Please go into the editor, select the code, and click on the 'Code {}' button to make the code readable to the rest of us.


Answers (1)

Walter Roberson on 4 Apr 2011
Please post the first few lines of the file you are having trouble with.
  1 Comment
Pap on 4 Apr 2011
Thanks Walter,
The first rows of the original (big) ASCII are:
STOCK DATE TIME PRICE VOLUME MARKET
??? 04/01/2010 10293955 18.34 500 ?????? ?????.
??? 04/01/2010 10293955 18.34 70 ?????? ?????.
??? 04/01/2010 10293955 18.34 430 ?????? ?????.
??? 04/01/2010 10293955 18.34 200 ?????? ?????.
??? 04/01/2010 10293955 18.34 100 ?????? ?????.
??? 04/01/2010 10293955 18.34 40 ?????? ?????.
??? 04/01/2010 10293955 18.34 215 ?????? ?????.
Please also note that I ran it without a delimiter specification in textscan, because the file mixes delimiters (tabs, spaces, etc.).
PS: The last column, MARKET, is simply in Greek (it is not a copy/paste error).
Thanks again,
Panos
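For a file of this size, one possible approach is to process it in blocks rather than all at once. This is a rough sketch under stated assumptions, not a tested solution for the real data. The sample file, the block size, and the variable names (inFile, outFile, prevPrice, prevState) are hypothetical. It assumes every data line has at least four whitespace-separated fields with PRICE as the fourth, and it reads the intent of buy(idx+1) = buy(idx) as "keep the previous decision when the price is unchanged".

```matlab
% --- Hypothetical sample input; point inFile at the real 'stocks.txt' ---
inFile  = [tempname '.txt'];
outFile = [tempname '.txt'];
fid = fopen(inFile, 'wt');
fprintf(fid, 'STOCK DATE TIME PRICE VOLUME MARKET\n');
prices = [18.34 18.34 18.40 18.40 18.40 18.10 18.10];
for p = prices
    fprintf(fid, 'ABC 04/01/2010 10293955 %.2f 100 MAIN MARKET\n', p);
end
fclose(fid);

blockSize = 3;          % rows per block; something like 1e6 for a large file
fin  = fopen(inFile, 'rt');
fout = fopen(outFile, 'wt');
fprintf(fout, '%s Trade\n', fgetl(fin));   % copy header, append column name

prevPrice = [];         % last price seen in the previous block
prevState = 1;          % +1 = Buy, -1 = Sell; the first row defaults to Buy
while true
    % Read up to blockSize whole lines as text
    blk = textscan(fin, '%s', blockSize, 'Delimiter', '\n');
    lines = blk{1};
    if isempty(lines), break; end

    % Price is the 4th whitespace-separated token on each line
    tok = regexp(strtrim(lines), '\s+', 'split');
    price = str2double(cellfun(@(t) t{4}, tok, 'UniformOutput', false));
    if isempty(prevPrice), prevPrice = price(1); end

    % +1 price rose, -1 price fell, 0 unchanged
    s  = [prevState; sign(diff([prevPrice; price]))];
    nz = s ~= 0;
    v  = s(nz);
    s  = v(cumsum(nz));   % forward-fill unchanged rows with the last decision
    s  = s(2:end);        % drop the carried-over state from the previous block

    % Append the decision to each original line and write the block out
    label = repmat({' Sell'}, size(price));
    label(s > 0) = {' Buy'};
    out = strcat(lines, label);
    fprintf(fout, '%s\n', out{:});

    prevPrice = price(end);
    prevState = s(end);
end
fclose(fin);
fclose(fout);
```

Only one block of lines is held in memory at a time, and the last price and decision are carried into the next block so that results do not depend on where the block boundaries fall. Lines with a non-numeric price field would yield NaN and be labelled Sell here, so they would need separate handling.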

