# Importing specific rows of Data from Text file

Hi;

I have some sensor data in a very large text (.dat) file. Some of the relevant data from this file needs to be analyzed and plotted in MATLAB.

A sample of the data looks like this:

C 0 0.001 -0.02 24.09 4.64 -100.00 -100.00

C 0 1.005 0.29 24.09 4.43 -100.00 -100.00

C 0 2.009 -0.34 24.09 8.26 -100.00 -100.00

C 0 3.014 -0.18 24.06 6.06 -100.00 -100.00

C 0 4.018 0.07 24.06 5.61 -100.00 -100.00

C 0 5.022 0.02 24.09 4.92 -100.00 -100.00

C 0 6.026 0.34 24.12 4.28 -100.00 -100.00

C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00

C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00

R 0 60.275 -0.157674 -0.006891 0.000000 ......

All I want to import into MATLAB and analyze are the rows that start with the letter 'R', which stands for Result. The 'Result' rows occur in a regular pattern in this big text file: an 'R' row appears every 160 rows.

How can I import only these 'Result' rows into MATLAB, either interactively or programmatically? I would deeply appreciate a detailed answer, as I am at an intermediate level of MATLAB programming.

Thank you so much in advance! Pramit

### Accepted Answer

per isakson
on 7 Jan 2015

Edited: per isakson
on 8 Jan 2015

If the entire file fits in memory, try this code

>> num = cssm()

num =

0 60.2750 -0.1577 -0.0069 0

0 60.2750 -0.1577 -0.0069 0

0 60.2750 -0.1577 -0.0069 0

0 60.2750 -0.1577 -0.0069 0

where

function out = cssm()

% read the entire file to one cell array with one row per cell

fid = fopen( 'cssm.txt', 'r' );

cac = textscan( fid, '%s', 'Delimiter', '\n' );

[~] = fclose( fid );

% find rows which begin with 'R'.

isR = cellfun( @(str) strncmp(strtrim(str),'R',1), cac{:} );

% extract the rows beginning with 'R'

rlt = cac{:}(isR);

% join all rows with results to one long string separated by '\n'

one_str = strjoin( rlt, '\n' );

% parse the string.

result = textscan( one_str, '%c%f%f%f%f%f', 'CollectOutput',true );

% make sure that only results are included in the output

assert( strcmp( unique(result{1}), 'R' ), 'Non-result rows included in result' )

out = result{2};

end

and where cssm.txt contains

C 0 6.026 0.34 24.12 4.28 -100.00 -100.00

C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00

C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00

R 0 60.275 -0.157674 -0.006891 0.000000

C 0 6.026 0.34 24.12 4.28 -100.00 -100.00

C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00

C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00

R 0 60.275 -0.157674 -0.006891 0.000000

C 0 6.026 0.34 24.12 4.28 -100.00 -100.00

C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00

C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00

R 0 60.275 -0.157674 -0.006891 0.000000

C 0 6.026 0.34 24.12 4.28 -100.00 -100.00

C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00

C 0 8.034 -0.23 24.09 7.50 -100.00 -100.00

R 0 60.275 -0.157674 -0.006891 0.000000

 

... and an alternative, which is an order of magnitude faster

function out = faster()

% read the entire file to one string

str = fileread( 'cssm.txt' );

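% find start and end indices of all the "rows" beginning with 'R'

% (the look-behind requires whitespace before 'R', so an 'R' row on the first line of the file, or a last row without a trailing newline, is not matched)

xpr = '(?<=\s)R[^\n\r]+(\n|\r){1,2}';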

[ix1,ix2] = regexp( str, xpr, 'start', 'end' );

% extract the "rows" beginning with 'R'

isi = false(1,length(str));

for ii = 1:length(ix1)

isi(ix1(ii):ix2(ii))=true;

end

one_str = str(isi);

% parse the string.

result = textscan( one_str, '%c%f%f%f%f%f', 'CollectOutput',true );

% make sure that only results are included in the output

assert( strcmp( unique(result{1}), 'R' ), 'Non-result rows included in result' )

out = result{2};

end
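Both functions above read the whole file at once. If the file is too large for that, one possible variation is to read it line by line and keep only the 'R' rows. The following is a minimal sketch and not part of the answer itself: the sample file contents and the names fname, rows and out are made up for illustration, and it assumes every 'R' row has the same number of numeric columns.

```matlab
% Write a small sample file so the sketch is self-contained (hypothetical data)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'C 0 6.026 0.34 24.12 4.28 -100.00 -100.00\n');
fprintf(fid, 'R 0 60.275 -0.157674 -0.006891 0.000000\n');
fprintf(fid, 'C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00\n');
fprintf(fid, 'R 0 61.275 -0.257674 -0.016891 1.000000\n');
fclose(fid);

% Read one line at a time and keep only the rows that start with 'R'
fid = fopen(fname, 'r');
rows = {};
tline = fgetl(fid);
while ischar(tline)              % fgetl returns -1 (not char) at end of file
    tline = strtrim(tline);
    if strncmp(tline, 'R', 1)
        % skip the leading 'R' and convert the remaining numbers to a row vector
        rows{end+1} = sscanf(tline(2:end), '%f').';
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);

% Stack the rows into one numeric matrix (assumes equal column counts)
out = vertcat(rows{:});
```

Only one line is held in memory at a time, at the cost of being slower than the whole-file approaches.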

### More Answers (3)

Shoaibur Rahman
on 18 Dec 2014

I think the following code will serve your purpose. I assume that the text file is named textFile.txt and is saved in your working directory; otherwise, add the file path.

A few notes about the code to help you understand it (if you still have questions, please feel free to contact me):

- cellData is your text data in cell-array form.
- The first for loop finds the first row of your result data.
- The second set of for loops generates a matrix ResultData that contains all your result data, so you can use that matrix for further analyses. Each row of ResultData corresponds to one result row in the original text file, without the label R.

filename = 'textFile.txt';

delimiter = ' ';

formatSpec = '%s%f%f%f%f%f%f%f%[^\n\r]';

fileID = fopen(filename,'r');

dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'MultipleDelimsAsOne', false, 'ReturnOnError', false);

fclose(fileID);

DataIndex = 2:8;

dataArray(DataIndex) = cellfun(@(x) num2cell(x), dataArray(DataIndex), 'UniformOutput', false);

cellData = [dataArray{1:end-1}];

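% find the first row whose label is 'R'

for k = 1:size(cellData,1)

if strcmp(cellData{k,1}, 'R')

RstartRow = k;

break

end

end

% the 'R' rows repeat every 160 rows

R_rows = RstartRow:160:size(cellData,1);

% copy the numeric columns of each 'R' row into ResultData

ResultData = zeros(length(R_rows), size(cellData,2)-1);

for k = 1:length(R_rows)

for kk = 2:size(cellData,2)

ResultData(k,kk-1) = cellData{R_rows(k),kk};

end

end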
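As a possible alternative to stepping through the rows at a fixed interval of 160, the 'R' rows can be located directly from the label column. This is only a sketch under assumptions: the sample text stands in for textFile.txt, the values are invented, and it assumes the 'R' rows carry seven numeric columns like the 'C' rows.

```matlab
% Hypothetical sample text standing in for the contents of textFile.txt
txt = sprintf(['C 0 6.026 0.34 24.12 4.28 -100.00 -100.00\n' ...
               'R 0 60.275 -0.15 -0.006 0.00 1.00 2.00\n' ...
               'C 0 7.030 -0.46 24.09 8.37 -100.00 -100.00\n' ...
               'R 0 61.275 -0.25 -0.016 1.00 3.00 4.00\n']);

% One label column followed by seven numeric columns
dataArray = textscan(txt, '%s%f%f%f%f%f%f%f');

% Logical mask of the rows labelled 'R'
isR = strcmp(dataArray{1}, 'R');

% Concatenate the numeric columns and keep only the 'R' rows
numeric = [dataArray{2:end}];
ResultData = numeric(isR, :);
```

Because the mask comes from the labels themselves, this does not depend on the 'R' rows being exactly 160 rows apart.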

##### 3 Comments

Shoaibur Rahman
on 28 Dec 2014

Hi,

Thank you. Let's discuss this together. To do so, we first take a simple example:

out = cellfun(@mean, {1:10,1:5})

This computes the mean of the two vectors 1:10 and 1:5. Each output is a scalar of the same type, so 'UniformOutput' can be true, which is the default.

Now, dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'MultipleDelimsAsOne', false, 'ReturnOnError', false); returns a dataArray whose cells differ in type (both cell and double).

To convert all of them into cells, we use a function handle defined by @(x) num2cell(x), where x is a dummy variable that receives each element of dataArray(DataIndex) in turn.

We also want to do this only with the data we are interested in, which is selected by dataArray(DataIndex).

Finally, because each output is nonscalar and may be of a different size, we set UniformOutput to false.
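The difference between the two settings can be seen in a small, self-contained sketch. The input vectors and the names m, c and combined are chosen here only for illustration:

```matlab
% Scalar output per cell: UniformOutput defaults to true,
% so the results are collected into a numeric array
m = cellfun(@mean, {1:10, 1:5});

% Non-scalar output per cell: UniformOutput must be false,
% so the results are returned in a cell array
c = cellfun(@(x) num2cell(x), {[1;2;3], [4;5;6]}, 'UniformOutput', false);

% Each element of c is a 3x1 cell array, so they can be concatenated side by side
combined = [c{:}];
```

Here m is the numeric row vector [5.5 3], while combined is a 3-by-2 cell array, the same kind of concatenation used to build cellData in the answer.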

Sean de Wolski
on 16 Dec 2014

Sudharsana Iyengar
on 18 Dec 2014

Edited: Sudharsana Iyengar
on 18 Dec 2014

I don't know if this will help. When I looked at your sensor data, the first column consisted of strings while the remaining columns consisted of numbers. You can do either of the following:

1) Open your file in Excel, sort it in ascending or descending order by the first column, and pick out the values manually.

2) Replace the string values with their ASCII codes. Your data contained C and M. The ASCII code for C is 67 (binary 01000011), for M it is 77 (binary 01001101), and for R it is 82 (binary 01010010). After making this transformation, you can import your data into MATLAB.

The data will then be stored as a matrix, with the first column holding the ASCII codes and the remaining 7 columns holding the numbers. Then you can use the following code.

Your data will be stored as untitled.

j=1;k=1;l=1;

for i=1:size(untitled,1)

if untitled(i,1)==67 % 'C'

B(j,:)=untitled(i,2:end);j=j+1; % store the remaining 7 columns in a separate variable

end

if untitled(i,1)==77 % 'M'

C(k,:)=untitled(i,2:end);k=k+1;

end

if untitled(i,1)==82 % 'R'

D(l,:)=untitled(i,2:end);l=l+1;

end

end

This will create three variables B, C and D holding the C, M and R values respectively. Let me know if this was helpful.
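The same split can also be written without a loop by using logical indexing. This is a minimal sketch, assuming the first column of untitled holds the decimal character codes (67 for 'C', 77 for 'M', 82 for 'R'); the matrix below is invented sample data with fewer columns than the real file:

```matlab
% Hypothetical imported matrix: column 1 is the letter code, the rest are numbers
untitled = [67 0  6.026  0.34;
            82 0 60.275 -0.157674;
            77 0  1.000  0.50;
            82 0 61.275 -0.25];

% double('C') is 67, double('M') is 77 and double('R') is 82
B = untitled(untitled(:,1) == double('C'), 2:end);   % rows labelled C
C = untitled(untitled(:,1) == double('M'), 2:end);   % rows labelled M
D = untitled(untitled(:,1) == double('R'), 2:end);   % rows labelled R
```

Each comparison produces a logical mask over the rows, so the matrices are built in one step instead of growing inside a loop.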
