Thread Subject:
Extracting data from tab separated data files

Subject: Extracting data from tab separated data files

From: Catalin Eberhardt

Date: 8 Apr, 2010 08:44:05

Message: 1 of 7

Hi everyone,

I have a bunch of tab-separated data files produced by a script that acquires data in a Psychology experiment. I would like to be able to access the numerical data in each file easily, and the first step is probably to extract a big matrix from each file.

Unfortunately, the script created rather sloppy data files (lots of text descriptions, etc.), so a bit of "filtering" has to be done before arriving at a "clean" matrix (i.e. only numerical data, with maybe a header for each column). A sample data file is located here: http://www.2shared.com/file/12478975/d6307cec/data.html (use the "Save file to your PC" link).

Can anyone suggest what would be an easy way of doing this?

I guess that, in order to automate the data retrieval, the function 'dlmread' would have to be used in a script. To test how it would work for a single file, I used the Import Data wizard. The wizard automatically (and correctly) detects 5 lines of text header, but then creates a Data variable (matrix) that contains only 1 row.

Any help would be appreciated, many thanks in advance.
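
The dlmread route mentioned above can be sketched as follows. This is only a minimal illustration under assumptions: the stand-in file written here is hypothetical (5 header lines followed by purely numeric, tab-separated rows) and is not the real data file. With text lines interleaved between the numeric rows, as in the actual files, dlmread would not be expected to get past the first text line.

```matlab
% Write a small hypothetical file: 5 header lines, then numeric rows.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
for k = 1:5
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '1\t2\t3\n');
fprintf(fid, '4\t5\t6\n');
fclose(fid);

% Row and column offsets are zero-based, so R = 5 skips the 5 header lines.
M = dlmread(fname, '\t', 5, 0);
delete(fname);
```

Here M should come out as a 2-by-3 matrix holding the two numeric rows.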

Subject: Extracting data from tab separated data files

From: us

Date: 8 Apr, 2010 08:55:09

Message: 2 of 7

"Catalin Eberhardt" <longtalker@gmail.com> wrote in message <hpk50l$kub$1@fred.mathworks.com>...
> Hi everyone,
>
> I have a bunch of tab-separated data files produced by a script that acquires data in a Psychology experiment. I would like to be able to access the numerical data in each file easily, and the first step is probably to extract a big matrix from each file.
>
> Unfortunately, the script created rather sloppy data files, (lots of text descriptions, etc), so a bit of "filtering" has to be done before arriving at a "clean" matrix (i.e. only numerical data, with maybe a header for each column). A sample data file is located here: http://www.2shared.com/file/12478975/d6307cec/data.html (use the "Save file to your PC" link)
>
> Can anyone suggest what would be an easy way of doing this?
>
> I guess in order to automate the process of data retrieval, the function 'dlmread' would have to be used in a script, and to test how it would work for a single file, I used the Import Data wizard; the wizard automatically (and correctly) detects 5 lines of text header, but then proceeds to create a Data variable (matrix) that only contains 1 row.
>
> Any help would be appreciated, many thanks in advance.

Due to firewall restrictions, we cannot download the file from that site...

us

Subject: Extracting data from tab separated data files

From: Catalin Eberhardt

Date: 8 Apr, 2010 10:07:06

Message: 3 of 7

I uploaded it here too: http://longtalker.20x.cc/data.csv

If that doesn't work either, could you please suggest another way of uploading it that will work even with a firewall? Thanks.

Subject: Extracting data from tab separated data files

From: Catalin Eberhardt

Date: 8 Apr, 2010 12:03:21

Message: 4 of 7

OK, I've made some progress on my own in the meantime. I tried the built-in function textscan, but it allows only one use of the CommentStyle parameter, so I cannot define several kinds of text lines for it to ignore.

I also tried txt2mat from the File Exchange, but that doesn't produce the expected result: besides the text lines, it also ignores numerical lines in the data file.

Subject: Extracting data from tab separated data files

From: Andres

Date: 8 Apr, 2010 20:49:24

Message: 5 of 7

"Catalin Eberhardt" <longtalker@gmail.com> wrote in message <hpkgm9$5ve$1@fred.mathworks.com>...
> OK I've made some developments on my own in the meantime, I tried using the built-in function textscan but that function only allows one use of the CommentStyle parameter, therefore I cannot define several text lines for it to ignore.
>
> I also tried using txt2mat, from the File Exchange, but that doesn't produce the expected result, i.e. it also ignores numerical lines in the data file, aside from the text lines.

Hi Catalin,
I've had a look at your file in the meantime. It is only 5 kB and has a quite regular structure of mostly alternating descriptive lines and numerical data lines. If you are really interested in the numerical data only, see my reply on the txt2mat File Exchange page.

However, I could imagine you'd like to get more information out of the file. In that case I'd recommend parsing the file with some customized code, relying e.g. on fgetl and textscan, and storing the data in a more appropriate way, e.g. using structs. Ideally that code would be applicable to each of your files...
Good luck!
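
A minimal sketch of the line-by-line approach suggested above, under stated assumptions: the file contents below are a hypothetical stand-in (the real file is not reproduced in this thread), the field names label and values are made up for illustration, and sscanf is used here in place of textscan to test whether a line is purely numeric.

```matlab
% Create a small stand-in file (hypothetical contents, not the real data).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Subject: test run\n');
fprintf(fid, 'trial = 1\n');
fprintf(fid, '4.8\t1.6\t2.1\n');
fprintf(fid, 'trial = 2\n');
fprintf(fid, '-0.4\t0.6\t-3.0\n');
fclose(fid);

% Walk the file line by line. Each purely numeric line becomes one struct
% element, labelled with the most recent descriptive line before it.
fid = fopen(fname, 'r');
blocks = struct('label', {}, 'values', {});
lastLabel = '';
tline = fgetl(fid);
while ischar(tline)                       % fgetl returns -1 at end of file
    [vals, count, errmsg] = sscanf(tline, '%f');
    if isempty(errmsg) && count > 0
        % The whole line parsed as numbers: store it as a row vector.
        blocks(end+1) = struct('label', lastLabel, 'values', vals.');
    elseif ~isempty(strtrim(tline))
        % Anything else that is not blank is treated as a description.
        lastLabel = strtrim(tline);
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```

With this stand-in file, blocks ends up with two elements, so blocks(2).label is 'trial = 2' and blocks(2).values is the matching numeric row. Keeping the label next to the numbers is what distinguishes this from a plain numeric matrix.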

Subject: Extracting data from tab separated data files

From: Catalin Eberhardt

Date: 8 Apr, 2010 21:39:06

Message: 6 of 7

Hi Andres,

Many thanks for replying! In this case, I could define as a "bad line" any line that contains either ":", "test" or "=". I tried the command

A=txt2mat('d:\data.csv','BadLineString',{[':' '=' 'test']})

but this produced

A =
  Columns 1 through 12
    4.8086 1.6040 2.1617 -0.4620 0.6502 -3.0033 -4.5974 3.8317 1.5116 1.8449 -0.5677 0.2805

  Columns 13 through 14
   -2.4620 -4.3333

Trying with just ':' also produced a different matrix than expected.

It is the case that I only need the numbers from these identically formatted data files, so if I could just eliminate all the lines containing one of those strings, I would be happy :)
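
One thing worth noting about the attempt above: square brackets concatenate character arrays, so the cell passed as 'BadLineString' holds the single string ':=test' rather than three separate strings. The sketch below shows that, and then does the "drop every line containing one of these strings" step in plain MATLAB. The sample lines and variable names are hypothetical, and the final conversion assumes every remaining line has the same number of values.

```matlab
% [] concatenates, so this cell contains ONE string.
badAttempt = {[':' '=' 'test']};
disp(badAttempt{1})                      % displays :=test

% Hypothetical stand-in lines; the real file contents are not shown here.
txtLines   = {'Subject: A', 'gain = 2', '4.8 1.6 2.1', ...
              'test block', '-0.4 0.6 -3.0'};
badStrings = {':', '=', 'test'};         % three separate strings

% Flag every line that contains at least one of the bad strings.
isBad = false(size(txtLines));
for k = 1:numel(badStrings)
    hits  = ~cellfun(@isempty, strfind(txtLines, badStrings{k}));
    isBad = isBad | hits;
end

% Convert the surviving lines to rows of a numeric matrix.
cleanLines = txtLines(~isBad);
A = cell2mat(cellfun(@(s) sscanf(s, '%f').', cleanLines(:), ...
                     'UniformOutput', false));
```

Whether txt2mat accepts a multi-element cell such as {':', '=', 'test'} for 'BadLineString' is best checked against its own help text.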

Subject: Extracting data from tab separated data files

From: Catalin Eberhardt

Date: 8 Apr, 2010 21:49:19

Message: 7 of 7

Solved it - your solution

A = txt2mat('example.txt',0,'ReadMode','line')
A = A(any(isfinite(A),2),:)

was all that was needed! Many thanks for that indeed! Now I have the grueling task of reorganising that matrix in a more logical and less redundant way :)
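
The second line of that solution can be illustrated on its own. The sketch assumes that reading in line mode leaves a row of NaN for each line without numbers (an assumption about txt2mat's output, not verified here); the matrix below is hypothetical.

```matlab
% Hypothetical result of a line-by-line read: text lines became NaN rows.
A = [ NaN  NaN  NaN;
      4.8  1.6  2.1;
      NaN  NaN  NaN;
     -0.4  0.6  NaN];

% Keep a row if at least one of its entries is finite.
keep = any(isfinite(A), 2);
A = A(keep, :);
```

Rows 1 and 3 are dropped. The last row survives with its trailing NaN, because any() only requires one finite entry per row; using all() instead would also discard partially filled rows.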
