Replace missing values by 0

15 views (last 30 days)
Manuel Aloy
Manuel Aloy on 8 Jun 2015
Edited: Walter Roberson on 8 Jun 2015
Hello everybody
I importing some .txt data from Bloomberg to Matlab. The problem is that in some cases there is a value missing in my .txt file and Matlab stops reading. I would love to automatically replace empty values (cells) in my .txt file by 0's so that Matlab keeps reading.
I attach the file I want to import and the code. As you will see there is a missing value in line 45 column H. Matlab stops reading exactly in that place and delivers "Subscripted assignment dimension mismatch". .
I hope you can help me.
Best regards
fid = fopen( ['file1.txt'] );
N_beg = 8;
N_end = 49;
Data = zeros(44,5); %44 = 10 years and 4 quartely results per year
for i = 1 : N_end
tline = fgets(fid);
if(i >= N_beg)
ii = i-N_beg + 1;
A = strsplit(tline,'/');
AA = strsplit(A{1,3});
B = strsplit(A{1,4});
C = A{1,4};
D = B{1,4};
for j = 1 : 2;
Data(i-N_beg+1,j) = str2num(A{1,j});
end
Data(i-N_beg+1,3) = str2num(AA{1,1});
Data(i-N_beg+1,4) = str2num(B{1,2});
Data(i-N_beg+1,5) = str2num(D); %Earnings surprise
end
end

Accepted Answer

Stephen23
Stephen23 on 8 Jun 2015
Edited: Stephen23 on 8 Jun 2015
Although the data is formatted rather sloppily, it is possible to read the whole file at once using textscan:
fid = fopen('file1.txt','rt'); % specify the permission!
C = textscan(fid, '%s%s%d%s%*u%f%f%f%s%*u%*u%s%f%f', 'HeaderLines',2, 'Delimiter',{' ','\t'}, 'CollectOutput',true);
fclose(fid);
This gives the following data (for clarity only the last seven lines of each array are shown):
>> C
C =
{48x2 cell} [48x1 int32] {48x1 cell} [48x3 double] {48x2 cell} [48x2 double]
>> C{1}(42:48,:)
ans =
'04/04/2006' 'Q4'
'01/19/2006' 'Q3'
'01/19/2006' 'Q2'
'05/05/2005' 'Q1'
'02/03/2005' 'Q4'
'10/28/2004' 'Q3'
'N.A.' 'Q2'
>> C{2}(42:48,:)
ans =
5
5
5
5
4
4
4
>> C{3}(42:48,:)
ans =
'12/05'
'09/05'
'06/05'
'03/05'
'12/04'
'09/04'
'06/04'
>> C{4}(42:48,:)
ans =
0.2300 0.2800 0.2070
0.3100 0.3500 NaN
0.1100 0.1100 0.1370
0.1900 0.1700 0.1850
0.1000 0.2700 0.1530
0.1400 0.2000 0.2300
0.2000 0.1000 0.1000
>> C{5}(42:48,:)
ans =
'35.27%' '-0.76%'
'' '5.49%'
'-19.71%' '5.49%'
'-8.11%' '-1.87%'
'76.47%' '3.18%'
'-13.04%' '-6.33%'
'0.00%' ''
>> C{6}(42:48,:)
ans =
0.8400 18.8500
0.7100 23.1400
0.5400 30.3300
0.6300 26.0000
NaN NaN
NaN NaN
NaN NaN
And then the percentages can be converted to numeric values (divide by 100 to get the decimal equivalent):
>> str2double(regexprep(C{5},'%$',''))
ans =
NaN NaN
NaN NaN
NaN NaN
NaN NaN
1.2100 0.1500
12.9500 5.2500
7.8700 -6.3800
1.0800 0.2100
-11.7600 -2.7000
-1.3600 -0.8400
14.0400 -2.4900
15.1100 2.0200
-6.4700 -4.8900
6.6700 6.3000
3.4500 -3.9100
-33.8200 -4.4100
35.0400 -1.0600
5.5000 -0.7200
2.6600 2.8500
5.6600 0.6500
-16.3500 1.6200
-9.0900 0.9000
-21.5700 0.4200
7.4800 -0.1900
4.0000 -7.6600
-56.7600 -0.0900
2.6600 1.2300
37.9300 5.6300
80.0000 18.1800
NaN -9.4800
-5.1100 28.2900
-39.9300 -2.0600
31.8700 6.5800
20.9300 -8.6900
-36.4000 2.7800
40.1000 1.6300
-18.6400 -1.3700
-45.4500 -3.1800
64.2500 3.3300
62.9200 9.9700
97.1800 2.4200
35.2700 -0.7600
NaN 5.4900
-19.7100 5.4900
-8.1100 -1.8700
76.4700 3.1800
-13.0400 -6.3300
0 NaN

More Answers (0)

Categories

Find more on Large Files and Big Data in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!