Reading a text file with columns of different lengths in MATLAB, with empty entries as NaN

My text file contains a 31*13 matrix. The problem is the 3rd column, which has a length of 28; columns no. 3, 5, 8, 10, 12 have a length of 30. This is daily temperature data for one year. I tried to read it, but I am getting different biases.
Is it possible to read it in MATLAB and place a NaN value wherever an empty value appears?
The text file is attached. Looking forward to hearing from you. Thanks

 Accepted Answer

It used to be tricky to read files like this one with MATLAB. I don't know whether The MathWorks provided a solution in a recent release, but I don't think so. I attach a file, which is work in progress, but it seems to do the job. The function may be called in two ways:
cac = read_fixed_format( 'CHLMIN00.TXT' ...
, '%10f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f' ...
, 'Headerlines',1 );
cac = read_fixed_format( 'CHLMIN00.TXT', '%10f12(%9f)', 'Headerlines',1 );
test
>> cac{3}'
ans =
Columns 1 through 9
3.4000 3.9000 4.4000 3.1000 2.8000 2.2000 3.1000 3.6000 3.5000
Columns 10 through 18
3.9000 4.2000 3.9000 3.3000 3.3000 5.0000 6.1000 5.0000 5.6000
Columns 19 through 27
4.4000 3.9000 2.8000 3.9000 4.7000 3.9000 3.9000 4.4000 5.0000
Columns 28 through 31
5.0000 5.3000 NaN NaN
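For readers who do not have the attached file, the idea behind fixed-width reading can be sketched in plain MATLAB. This is only an illustration under stated assumptions, not the code of read_fixed_format: the three-column sample, the widths [4 6 6] and all variable names are made up for the example. Each row is padded with blanks to the full width, cut into fields by character position, and converted with str2double, which returns NaN for a blank field.

```matlab
% Hypothetical sample: 3 fixed-width columns of widths 4, 6 and 6.
% Row 2 lacks its last field; row 3 has a blank middle field.
sample = sprintf('%4d%6.1f%6.1f\n%4d%6.1f\n%4d      %6.1f\n', ...
                 1, 3.4, 5.0, 2, 3.9, 3, 6.1);
widths = [4, 6, 6];

rows  = regexp(sample, '\r?\n', 'split');   % one cell per line
rows  = rows(~cellfun('isempty', rows));    % drop the empty piece after the last newline
edges = cumsum([0, widths]);                % field c spans edges(c)+1 : edges(c+1)
data  = nan(numel(rows), numel(widths));

for r = 1:numel(rows)
    % Pad short rows with blanks so that every field can be sliced
    txt = [rows{r}, blanks(max(0, edges(end) - numel(rows{r})))];
    for c = 1:numel(widths)
        % str2double returns NaN for an all-blank field
        data(r, c) = str2double(txt(edges(c)+1 : edges(c+1)));
    end
end
disp(data)
```

Because the fields are cut by position and not by delimiter, a missing value stays in its own column instead of being filled by the next number on the line.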
Does MATLAB offer a user-friendly way to read CHLMIN00.TXT? I made the test below with R2016a.
YES, interactively: Import Data gets the "empty entries" right, but it seems as if the first empty line of the text file ends up as a row of NaNs at the bottom of the data.
HOWEVER, the function importdata makes a mess of the "empty entries". On the other hand, it's not confused by the first empty line.
>> data = importdata( 'CHLMIN00.TXT' );
>> whos data
Name Size Bytes Class Attributes
data 31x13 3224 double
>> data(26:end,3)'
ans =
4.4000 5.0000 5.0000 5.3000 12.2000 10.1000
>>
I'm not amused!

13 Comments

my test
cac = read_fixed_format( 'CHLMIN00.TXT', '%10f12(%9f)', 'Headerlines',1 );
error
Undefined function 'ixStrFindVec' for input arguments of type 'char'.
Error in read_fixed_format (line 109)
ixc = ixStrFindVec( str, repmat( char(32), sz(1), 1 ) );
other test
Undefined function 'ixStrFindVec' for input arguments of type 'char'.
Error in read_fixed_format (line 109)
ixc = ixStrFindVec( str, repmat( char(32), sz(1), 1 ) );
Here it is (and in the answer). I think that's the only one missing.
Thank you for sharing this fine work. Could you please tell me what expand_format_spec.m does?
Also, I think MATLAB should handle this kind of data set in a new release.
Is it possible to save this 31*13 matrix to a text file again?
Thank you again.
It expands '%10f12(%9f)' to a format specifier that textscan needs.
Does %10f mean 10 decimal places of float data type? Does 12(%9f) mean 112 decimal places with float?
This way of calling is new to me.
'%10f12(%9f)' is shorthand for '%10f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f', which is a legal format specifier with textscan. I like the short form, because I find it hard both to write and read the latter format string. Thus, in this context '12(%9f)' produces the same string as repmat('%9f',1,12).
%9f "takes" the next 9 characters (incl. delimiters and spaces) and interprets these 9 characters as a double. See: textscan
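The expansion of the short form can be sketched in a few lines. This is a guess at the idea, not the source of expand_format_spec.m, and the variable names are hypothetical: every occurrence of N(...) is replaced by N copies of the text between the parentheses.

```matlab
short_spec = '%10f12(%9f)';

% tok{k} = {count, group}; rest holds the text around the matches
[tok, rest] = regexp(short_spec, '(\d+)\(([^)]*)\)', 'tokens', 'split');

long_spec = rest{1};
for k = 1:numel(tok)
    n         = str2double(tok{k}{1});                       % repeat count, here 12
    long_spec = [long_spec, repmat(tok{k}{2}, 1, n), rest{k+1}]; %#ok<AGROW>
end
disp(long_spec)
```

The result is the same string as ['%10f', repmat('%9f', 1, 12)], i.e. one %10f followed by twelve %9f.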
Dear @per isakson,
I have attached text files to this comment. When I open them with read_fixed_format.m, it does not create the cell array of 31*1 columns.
I tried reopening the text file in Excel and saving it again, but read_fixed_format.m still does not read it into 31*1 cell arrays:
data = read_fixed_format('SKDMIN02.TXT' , '%10f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f', 'Headerlines',1 );
Can you tell me what kind of problem this is?
Thank you as always
See the test of importdata in my answer.
It's a Herculean task to make a foolproof function that automagically reads any text file. Had it been easy, importdata would have been much more skillful.
read_fixed_format is dumb and relies totally on the instructions of the user, i.e. the format specifier. There are special reasons for each of 10, 9 and f.
SKDMIN02.TXT is a tab-delimited file, which read_fixed_format cannot read. (importdata can.) However, the "empty values" are stacked at the end of the bottom lines. The result will be useless.
ASTMAX93.TXT has fixed width columns and read_fixed_format can read it given a correct format string.
I am confused about the correct format string. I am using read_fixed_format to read multiple text files at once. How can I read ASTMAX93.TXT correctly with your code?
Could you please tell me how you were able to know whether this file is tab-delimited or has fixed-width columns?
Thanks
" ... to know this file is tab-delimited or has fixed column?" Either the info is supplied together with the files (in the best of worlds) or you have to find out yourself. I used the editor Notepad++ to inspect the files.
For files with fixed-width columns you need to know the width and the type of data of each column. See my earlier comment (per isakson on 15 Sep 2016 at 0:40).
"how you were able to know this file is tab-delimited or has fixed column?"
Use Notepad++: View -> Show Symbol -> Show All Characters.
"to know this file is tab-delimited" and "multiple text files at once": Try this
>> [ has, cnt ] = has_tabs( 'ASTMAX93.TXT' )
has =
0
cnt =
0
>> [ has, cnt ] = has_tabs( 'SKDMIN02.TXT' )
has =
1
cnt =
372
where
function [ has, cnt ] = has_tabs( filespec )
% Report whether the file contains tab characters, and how many
    str = fileread( filespec );
    ist = sprintf('\t') == str;     % logical mask of the tab characters
    has = any( ist );
    cnt = sum( double( ist ) );
end
"correct format string": Try
>> format_spec = search_format_specifier( 'SKDMIN02.TXT' )
Error using search_format_specifier (line 20)
The column widths of the rows 2 through 28 in "SKDMIN02.TXT" differ
>>
>> format_spec = search_format_specifier( 'ASTMAX93.TXT' )
format_spec =
%7f%6f%6f%6f%6f%6f%6f%6f%6f%6f%6f%6f
>>
>> format_spec = search_format_specifier( 'CHLMIN00.TXT' )
format_spec =
%10f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f%9f
where
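function format_spec = search_format_specifier( filespec, rows )
% Derive a textscan format specifier from a file with fixed-width numeric columns.
% rows = [ first, last ] are the lines that are inspected (default: 2 through 28).
    narginchk( 1, 2 )
    if nargin == 1
        rows = [ 2, 28 ];
    end
    % Read the selected lines verbatim, one string per line
    fid = fopen( filespec, 'r' );
    cac = textscan( fid, '%s', diff(rows)+1, 'Whitespace','' ...
                  , 'Delimiter','\n', 'Headerlines',rows(1)-1 );
    [~] = fclose( fid );
    % Column boundaries: a blank (or the end of the line) preceded by a digit
    cix = regexp( cac{1}, '(?<=\d)([ ]{1}|$)', 'start' );
    cix = unique( cell2mat( cix ), 'rows' );
    % All inspected lines must share the same column boundaries
    assert( size(cix,1) == 1, 'search_format_specifier:NotFixedWidth'   ...
          , 'The column widths of the rows %d through %d in "%s" differ' ...
          , rows(1), rows(2), filespec )
    % The widths are the distances between consecutive boundaries
    col_width   = diff([ 1, cix ]);
    format_spec = sprintf( '%%%df', col_width );
end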
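The width detection can be tried on a couple of strings without any file. The snippet below is a small sketch with made-up rows (right-aligned columns of widths 4, 6 and 6) and hypothetical variable names. Unlike the function above, it appends a blank to every row so that the last column is also terminated by a blank; that is my own simplification for the example.

```matlab
% Hypothetical rows with right-aligned columns of widths 4, 6 and 6
rows = { sprintf('%4d%6.1f%6.1f',  1,  3.4,  5.0)
         sprintf('%4d%6.1f%6.1f', 12, 13.9, -2.5) };

% Append one blank so the last number is followed by a blank as well
padded = cellfun(@(s) [s, ' '], rows, 'UniformOutput', false);

% Position of every blank that directly follows a digit, per row
cix = regexp(padded, '(?<=\d) ', 'start');
cix = unique(cell2mat(cix), 'rows');        % identical rows collapse to one
assert(size(cix, 1) == 1, 'The column boundaries of the rows differ')

col_width   = diff([1, cix]);               % distances between boundaries
format_spec = sprintf('%%%df', col_width);
disp(format_spec)
```

For these two rows the boundaries are at positions 5, 11 and 17, so the snippet displays %4f%6f%6f. Rows with a missing entry have different boundaries, which is why only fully populated rows should be inspected.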


More Answers (1)

On using
data = importdata(txtfile) ;
NaN is inserted by default wherever data is missing. I tried this in MATLAB R2015a.

1 Comment

I have tried that, but in the 3rd column importdata just interpolates the data set and does not produce NaN for the missing values. Please try it yourself.
