Can REGEXP be utilized in searching M x N matrix values?

The following text file contains numerous lines of data, including a 9 x 9 state covariance matrix:
State Time: 13267.000
State Date: 20120101
State Covariance:
-0.11000000000000e-01 0.12000000000000e01 0.13000000000000e00 0.14000000000000e-01 0.15000000000000e-01 0.16000000000000e-01 0.17000000000000e-01 0.18000000000000e-01 0.19000000000000e-01
0.21000000000000e-01 0.22000000000000e-01 0.23000000000000e-01 0.24000000000000e-01 0.25000000000000e-01 0.26000000000000e-01 0.27000000000000e-01 0.28000000000000e-01 0.29000000000000e-01
0.31000000000000e-01 0.32000000000000e-01 0.33000000000000e-01 0.34000000000000e-01 0.35000000000000e-01 0.36000000000000e-01 0.37000000000000e-01 0.38000000000000e-01 0.39000000000000e-01
0.41000000000000e-01 0.42000000000000e-01 0.43000000000000e-01 0.44000000000000e-01 0.45000000000000e-01 0.46000000000000e-01 0.47000000000000e-01 0.48000000000000e-01 0.49000000000000e-01
0.51000000000000e-01 0.52000000000000e-01 0.53000000000000e-01 0.54000000000000e-01 0.55000000000000e-01 0.56000000000000e-01 0.57000000000000e-01 0.58000000000000e-01 0.59000000000000e-01
0.61000000000000e-01 0.62000000000000e-01 0.63000000000000e-01 0.64000000000000e-01 0.65000000000000e-01 0.66000000000000e-01 0.67000000000000e-01 0.68000000000000e-01 0.69000000000000e-01
0.71000000000000e-01 0.72000000000000e-01 0.73000000000000e-01 0.74000000000000e-01 0.75000000000000e-01 0.76000000000000e-01 0.77000000000000e-01 0.78000000000000e-01 0.79000000000000e-01
0.81000000000000e-01 0.82000000000000e-01 0.83000000000000e-01 0.84000000000000e-01 0.85000000000000e-01 0.86000000000000e-01 0.87000000000000e-01 0.88000000000000e-01 0.89000000000000e-01
0.91000000000000e-01 0.92000000000000e-01 0.93000000000000e-01 0.94000000000000e-01 0.95000000000000e-01 0.96000000000000e-01 0.97000000000000e-01 0.98000000000000e-01 0.99000000000000e-01
State Confidence: -1
I’m looking for a method to extract the 9x9 matrix values and save them for processing. I’m using the following code to extract the matrix:
% Prompt user for file to open
[fn,pn]=uigetfile('*.*','Select SNER Text File');
filename = fullfile(pn,fn);
% Read in entire file
buffer = fileread(filename);
% Parse out the Matrix (9x9) and store as SCD
pattern = '*?State Covariance:\s+([-?\d\.]+\w[-?\d\.]+).([-?\d\.]+\w[-?\d\.]+).([-?\d\.]+\w[-?\d\.]+).([-?\d\.]+\w[-?\d\.]+).([-?\d\.]+\w[-?\d\.]+).([-?\d\.]+\w[-?\d\.]+).([-?\d\.]+\w[-?\d\.]+).([-?\d\.]+\w[-?\d\.]+).([-?\d\.]+\w[-?\d\.]+)';
tokens4 = regexp(buffer, pattern, 'tokens');
SCD = reshape(str2double([tokens4{:}]), 9, []).';
This code, despite the long search pattern, reliably finds all nine matrix values in row 1 and saves them. But how would I go about extracting the remaining rows? Is there a more practical approach than REGEXP for this task?

Accepted Answer

per isakson on 7 Apr 2014
Edited: per isakson on 7 Apr 2014
If the entire file fits in memory, I recommend that you
  • read the entire file as characters
  • extract the "numerical chunks" (an appropriate job for regexp)
  • convert them to double
Try
>> M = cssm();
>> whos M
Name Size Bytes Class Attributes
M 9x9x3 1944 double
where cssm.m is
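function M = cssm()
    % Match the text between the two labels: digits, signs, dots, 'e', spaces, newlines
    xpr = '(?<=State Covariance:)[ 0-9\.\+\-e\r\n]++(?=State Confidence:)';
    str = fileread( 'cssm.txt' );            % read the whole file as one char vector
    cac = regexp( str, xpr, 'match' );       % one cell per covariance block
    M = nan( 9, 9, length( cac ) );          % preallocate one 9x9 page per block
    for jj = 1 : length( cac )
        M( :, :, jj ) = str2num( cac{jj} );  % line breaks in the text separate the rows
    end
end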
and cssm.txt contains the data from your question, copied and pasted three times.

"contains numerous lines of data" made me think your file contains many matrices. However, the code above should also work with just one.

More Answers (1)

dpb on 7 Apr 2014
Surely it could be made to work, but it is way overkill for the job...
>> c=textread('brad.txt','%f',81,'headerlines',6);
>> c=reshape(c,9,[]).'
c =
-0.0110 1.2000 0.1300 0.0140 0.0150 0.0160 0.0170 0.0180 0.0190
0.0210 0.0220 0.0230 0.0240 0.0250 0.0260 0.0270 0.0280 0.0290
...
0.0810 0.0820 0.0830 0.0840 0.0850 0.0860 0.0870 0.0880 0.0890
0.0910 0.0920 0.0930 0.0940 0.0950 0.0960 0.0970 0.0980 0.0990
>>
The above does presume you know the size a priori... if not, it needs a little more work so it doesn't run off into the weeds at the end...
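One way that "little more work" might look when the size is not known in advance, as a rough sketch and not dpb's method: split the text into lines, start after the "State Covariance:" label, and keep parsing rows until a line yields no numbers. The sample text is inlined as a 2 x 3 block so the snippet runs on its own; txt, hdr, txtLines and rows are hypothetical names.

```matlab
% Sketch: infer the matrix size instead of hard-coding the 81 values.
% txt stands in for fileread(filename); a 2x3 block is used for brevity.
txt = sprintf(['State Time: 13267.000\nState Covariance:\n' ...
               '0.11e-01 0.12e-01 0.13e-01\n0.21e-01 0.22e-01 0.23e-01\n' ...
               'State Confidence: -1\n']);

hdr      = 'State Covariance:';
txtLines = regexp(txt, '\r?\n', 'split');                 % one cell per line
first    = find(strncmp(txtLines, hdr, numel(hdr)), 1) + 1;

rows = {};
for k = first : numel(txtLines)
    v = sscanf(txtLines{k}, '%f');     % empty when the line does not start with a number
    if isempty(v)
        break;                         % reached the next label or a blank line
    end
    rows{end+1} = v.';                 % store the values as a row vector
end
c = vertcat(rows{:});                  % errors if the rows differ in length
```

The number of rows and columns then falls out of the data itself. This sketch handles only the first covariance block and assumes every row of the matrix sits on its own line.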
