MATLAB Answers

Extracting parts of a string

Asked by Ana Maria Alzate on 15 Jun 2018
Latest activity Edited by Stephen Cobeldick on 18 Jun 2018
I have a text file with information like this:
FileName; SampleFreq; Test;Modality;Channel;Description;StimIntensity; Position; RecordingTime
C:\Users\G10040419\Desktop\lp export application\Data 139\00000090_1.WAV; 22000; 2;1;1;5 CH Right; 0.00; -10000; 40147.491374
I need to extract the SampleFreq (22000) and the Position (-10000). I tried to use regular expressions, but I cannot find a specific delimiter for these data.


4 Answers

Answer by Paolo
on 15 Jun 2018
Edited by Paolo
on 15 Jun 2018
 Accepted Answer

The following code uses regexp to extract the data you want:
data = fileread('00000090Head.txt');
expression = '(?<=WAV;\s*)(\d*)(?:;\s*\d*;\d;\d;(.*?(?=;));\s*\d*\.\d*;\s*)(-?\d*)';
[tokens,match] = regexp(data,expression,'tokens','match');
sampleFrequency = cellfun(@(x) x(1,1),tokens);
position = cellfun(@(x) x(1,2),tokens);
position and sampleFrequency are both 1x183 cell arrays of character vectors and contain the data you are interested in.
position = {'-10000' '-9000' '-8000' '-7000' '-6000' '-5000' '-4500' '-4000' '-3500' '-3000' ...}
sampleFrequency = {'22000' '22000' '22000' '22000' '22000' '22000' '22000' '22000' '22000' '22000' ...}
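Because regexp returns the tokens as text, a conversion step is needed before doing any arithmetic with them. A minimal sketch, assuming only the first three values shown above; the names positionNum and sampleFrequencyNum are made up for illustration:

```matlab
% Tokens from regexp are character vectors. The values below are the first
% three entries shown above, hard-coded so the snippet runs on its own.
position = {'-10000' '-9000' '-8000'};
sampleFrequency = {'22000' '22000' '22000'};

% str2double accepts a cell array of character vectors and returns a
% numeric array of the same size (NaN where the text is not a number).
positionNum = str2double(position);
sampleFrequencyNum = str2double(sampleFrequency);
```

The same call should work on the full 1x183 cell arrays.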



Answer by per isakson
on 15 Jun 2018
Edited by per isakson
on 15 Jun 2018

Is this what you are looking for?
fid = fopen( '00000090Head.txt', 'r' );
cac = textscan( fid, '%*s%f%*f%*f%*f%*s%*f%f%*f', 'Headerlines',1,'Delimiter',';' );
fclose( fid );
Then inspect the result:
>> cac
cac =
1×2 cell array
{183×1 double} {183×1 double}
>> cac{2}(1:3)
ans =
-10000
-9000
-8000



Answer by Ana Maria Alzate on 15 Jun 2018

Yes, but it is not giving me the position; it is giving me the last column, the recording time.

  5 Comments

Shifts like this one should not be a problem. (This is the only "shift" I find in the sample file.)
  • Do you have problems reading the uploaded sample file?
  • Why not upload a file that causes problems?
  • Do you get any error or warning messages?
Jan
on 15 Jun 2018
@Ana Maria Alzate: Please do not post comments in the section for answers in the future. There is a section for comments for this job. Thanks.
Thank you for the advice!



Answer by Stephen Cobeldick on 18 Jun 2018
Edited by Stephen Cobeldick on 18 Jun 2018

Importing the data as strings and then using regular expressions to parse them is inefficient and is not required, because that file is very nicely formatted in delimited columns, and the required data can easily and efficiently be read directly as numeric (or char). The command textscan makes it easy to specify how to read those columns, and the format string is much simpler and more intuitive than those regular expressions:
>> fmt = '%*s%f%*d%*d%*d%*s%*f%f%*f';
>> opt = {'HeaderLines',1,'Delimiter',';'};
>> [fid,msg] = fopen('00000090Head.txt','rt');
>> assert(fid>=3,msg)
>> C = textscan(fid,fmt,opt{:});
>> fclose(fid);
>> [C{:}]
ans =
22000 -10000
22000 -9000
22000 -8000
22000 -7000
22000 -6000
22000 -5000
22000 -4500
22000 -4000
22000 -3500
22000 -3000
22000 -3000
22000 -2500
22000 -2000
22000 -1500
22000 -1000
22000 -500
22000 0
22000 500
22000 1000
22000 1500
... lots of lines here
22000 -3000
22000 -2500
22000 -2000
22000 -1500
22000 -1000
22000 -500
22000 0
22000 500
22000 1000
22000 1500
22000 2000
22000 2500
22000 3000
22000 3500
22000 4000
22000 4000
22000 5000
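To make the format string easier to follow, here is a sketch that maps each specifier to the column it consumes. It assumes the single data line quoted in the question and passes it to textscan as a character vector rather than through a file identifier, so there is no header line to skip; the variable names txt, sampleFreq and position are chosen for illustration only:

```matlab
% One data line copied from the question, used in place of the file.
txt = ['C:\Users\G10040419\Desktop\lp export application\Data 139\00000090_1.WAV;' ...
       ' 22000; 2;1;1;5 CH Right; 0.00; -10000; 40147.491374'];

% A leading '*' tells textscan to read the field but discard it.
fmt = ['%*s' ...  FileName      (skipped)
       '%f'  ...  SampleFreq    (kept)
       '%*d' ...  Test          (skipped)
       '%*d' ...  Modality      (skipped)
       '%*d' ...  Channel       (skipped)
       '%*s' ...  Description   (skipped)
       '%*f' ...  StimIntensity (skipped)
       '%f'  ...  Position      (kept)
       '%*f'];  % RecordingTime (skipped)

% With ';' as the delimiter, spaces inside a text field stay in that field.
C = textscan(txt, fmt, 'Delimiter', ';');
sampleFreq = C{1};
position = C{2};
```

Only the two specifiers without '*' produce output, which is why C has two cells. Moving the unstarred %f to a different slot selects a different column.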
