Import Text File Data With Blank Rows

Nathanael on 30 Apr 2014
Edited: Cedric on 1 May 2014
I am having trouble importing a data set from a text file with blank rows. The data is in file "set1.txt" (attached) formatted as follows:
X data Y data
-----------------
1 5
2 7.8
3 2.1

X data2 Y data2
-----------------
1 2
2 2.4
3 8
When I use
G=importdata('set1.txt');
I have the first set of data contained in
>>G.data
ans=
1 5
2 7.8
3 2.1
Everything below the blank (empty) row is ignored. What I'd ultimately like to do is import each chunk of data either into its own n x 2 matrix or into a single appended matrix, so that all the data in set1.txt ends up in a matrix with the following format:
1 5 1 2
2 7.8 2 2.4
3 2.1 3 8
Any help would be much appreciated!

Accepted Answer

Cedric on 1 May 2014
Edited: Cedric on 1 May 2014
Run the following:
content = fileread( 'set1.txt' ) ;
blocks = regexp( content, '-[\n\r]+([^X]*)', 'tokens' ) ;
blocks = [blocks{:}] ;
and look at the content of blocks{1} and blocks{2}. Then see Per's answer in the thread that he links above. You can proceed the same way; I just wanted to provide you with a pattern which matches your setup, as it can be tricky to build if you are not familiar with regular expressions.
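For completeness, here is a minimal sketch of one way to go from the text blocks to the numeric matrices asked for in the question. It is not necessarily what the linked thread does. The use of sscanf and the variable names data and combined are my own assumptions, and the file content is rebuilt inline with sprintf (standing in for fileread) so that the snippet runs without the attachment.

```matlab
% Stand-in for fileread('set1.txt'): the sample from the question, built
% inline (assumption) so the sketch runs without the attached file.
content = sprintf( ['X data Y data\n-----------------\n1 5\n2 7.8\n3 2.1\n\n', ...
                    'X data2 Y data2\n-----------------\n1 2\n2 2.4\n3 8\n'] ) ;

% Same pattern as above: one text block per dashed header line.
blocks = regexp( content, '-[\n\r]+([^X]*)', 'tokens' ) ;
blocks = [blocks{:}] ;

% Parse each text block into an n x 2 numeric matrix.
% sscanf fills a 2 x n array column by column, hence the transpose.
data = cellfun( @(s) sscanf( s, '%f', [2, Inf] ).', blocks, ...
                'UniformOutput', false ) ;

% Side-by-side layout from the question. This concatenation assumes
% that every block has the same number of rows.
combined = [data{:}] ;
```

With this, data{1} and data{2} hold the individual n x 2 matrices, and combined holds the appended version. If the blocks can have different lengths, keep them in the cell array instead of concatenating.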
  2 Comments
Nathanael on 1 May 2014
Thanks guys. Between the two of you I got it working. I'm not familiar with these expressions, so it's not a straightforward solution, but it works! Thanks.
Cedric on 1 May 2014
Edited: Cedric on 1 May 2014
The pattern defines:
  • match the dash char: -
  • followed by one or more chars that are either a line break or a carriage return: [\n\r]+
  • then take as many chars as possible which are not X: [^X]*
  • and extract them as a token: ()
The regexp engine matches this pattern, extracts the token, and then goes on matching and extracting until the end of the content. While doing that, it stores tokens in a cell array. As a match can contain multiple tokens, the output cell array is a cell array of cell arrays. Each cell contains a cell array of all tokens related to a match. In your case, there is only one token per match, so each internal cell array contains only one cell. This is why I "flatten" blocks afterwards, so you end up having a cell array of tokens (instead of a cell array of cell arrays of tokens ;-))
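To make the nesting concrete, here is a small sketch on a made-up input. The string str and the variable names nested and flat are hypothetical, chosen only for illustration; the pattern is the same one as in the answer.

```matlab
% Hypothetical mini input with two "dash line + payload" sections.
str = sprintf( '--\nab\nX--\ncd\n' ) ;

% One match per section, one token per match.
nested = regexp( str, '-[\n\r]+([^X]*)', 'tokens' ) ;

% nested is a cell array with one element per match; each element is
% itself a cell array holding that match's tokens.
isNestedCell = iscell( nested{1} ) ;   % true: cell inside a cell

% Flatten: concatenating the inner cell arrays leaves one cell per token.
flat = [nested{:}] ;
isFlatChar = ischar( flat{1} ) ;       % true: tokens are now directly accessible
```

If a pattern had two tokens per match, each inner cell array would hold two cells, and flattening this way would interleave them, so the flattening step is mainly convenient in the one-token-per-match case.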
