Thread Subject:
determine col num from column headers

Subject: determine col num from column headers

From: Keli

Date: 11 Aug, 2008 19:02:03

Message: 1 of 5

Hello!
I have 5 data sets of 25 files each, with 465 to 517 columns
per file, and I need to read 4 specific columns from each
file. Each file is loaded from *.csv into a separate
'colheader' string array and a data matrix. While the files
in a data set tend to have the same number of columns, so far
I have had to view the 'colheader' string of each file
manually to be certain the target columns really have the
same column numbers. Once I am certain of a column number, I
load that column into a separate vector to use in other
MATLAB operations. Any suggestions on how I can determine the
column number? Can I use the column headers (the 'colheader'
string) to locate the specific column numbers?


Thanks much!

Subject: determine col num from column headers

From: Andres

Date: 12 Aug, 2008 08:55:05

Message: 2 of 5

Hi,

we need more information about your colheader string, but
here's a hint to an automation routine that might work, as
a best guess:

You know the exact distinct column title strings appearing
in your header line string

    myColumnNames = {'Val1','Val2','Val3','Val4'};

If your header string contains more than one line, you
should extract the line that contains your column names.
Maybe you can identify the lines by the positions of a line
break character:

    lineBreakPos = strfind(colheader,char(10));

Find out somehow which line has the names, perhaps you know
it beforehand. Assuming the k-th line is the line of
interest, do something like:

    columnNameLine = colheader(lineBreakPos(k-1)+1:lineBreakPos(k)-1);

(take obvious precautions if it is the first or last line
in colheader)
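
One possible way to handle the first and last line uniformly is to pad the list of line-break positions with a virtual break before the first character and after the last one. This is only a sketch: the sample header text, the `bounds` variable, and the choice k = 2 are assumptions made for illustration.

```matlab
% Hypothetical multi-line header; sprintf turns \n into char(10).
colheader = sprintf('some units line\nFreq;Torque;Temp\nlast line');
k = 2;                                   % assumed index of the column name line

lineBreakPos = strfind(colheader, char(10));

% Add a virtual line break before the first and after the last character,
% so that line k always lies between bounds(k) and bounds(k+1).
bounds = [0, lineBreakPos, numel(colheader) + 1];

columnNameLine = colheader(bounds(k)+1 : bounds(k+1)-1);
```

With this padding the same expression also works for k = 1 and for the last line, as long as k does not exceed the number of lines.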

Most probably there is a distinct delimiter (Tab,
space, ';', ',' ...) between the column name strings. Find
its positions in the line ...

    delim = ';';
    delimPos = strfind(columnNameLine,delim);

find out the positions of your column name strings ...

    namePos(n) = strfind(columnNameLine,myColumnNames{n});

and compare them to the delimiter positions to make out the
corresponding column index of your data matrix.
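
The comparison can be sketched as counting how many delimiters come before the name. The header line, the ';' delimiter, and the variable names `targetName` and `colIdx` are assumptions for illustration only.

```matlab
% Hypothetical header line; the names and the ';' delimiter are assumptions.
columnNameLine = 'Freq;Torque;Temp';
delim = ';';
targetName = 'Torque';

delimPos = strfind(columnNameLine, delim);       % positions of all delimiters
namePos  = strfind(columnNameLine, targetName);  % start position(s) of the name

% The column index is the number of delimiters before the name, plus one.
colIdx = sum(delimPos < namePos(1)) + 1;
```

Note that strfind also matches substrings, so a name such as 'Torque' would be found inside a longer name like 'TorqueMax', and namePos(1) raises an error if the name does not occur at all. Both cases would need a check in real use.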

I hope this gets you started.
Best regards
Andres

Subject: determine col num from column headers

From: Keli

Date: 12 Aug, 2008 19:58:01

Message: 3 of 5

Andres,
thanks for your reply.
This is probably too much detail but here it is:
The original .csv file contains about 23 header lines, but
I delete these so that the first line is, in fact, the column
names (extra header lines do not import properly). The
file is then imported into MATLAB and saved as a
matrix. Usually when I import csv files I get a matrix
and 2 'arrays', one called "textdata" and the other
called "colheaders". In this case one array was the
actual column names/headers (e.g. 'Torque') and the other
was the units for the columns (e.g. 'Nm'). For
some reason this is not working, and I have to re-import
the column headers as their own array, but this isn't
really an issue.

A better example:
I have just loaded a data file into a matrix
called "datafile1" (size <148x482 double>), and the
corresponding column names are in the "colheaders" array
(size <1x482 cell>). I know the exact column names, but I
must open the colheaders array and scroll to find where a
target column is, since the data matrix doesn't have
column names.
   For example: the column name/header "Torque" is in
column 422 of the 'colheaders' array. The corresponding
torque data is in column 422 of the matrix "datafile1".
The end goal is to use this column number (422) to read
the Torque data into a separate vector to be manipulated
later. If I double-click on the colheaders cell named
'Torque', the array editor title is colheaders{1,422}.

Herein lies the problem. I have to verify that all files in
this data set have the 'Torque' data stored in column 422,
or determine which column it is stored in. Once
determined, the torque data from each file will be saved
as a separate variable. If I only had to verify this
manually in 1 data set (approx. 25 files), it would be okay.
Currently, however, it is 125 files, with more on the way.

Also: I believe the string data in the column headers is
delimited by single quotes (') not semicolons (;). However,
modifying the code to " delim = '''; " does not work. I am
also unsuccessful with strfind:

strfind(colheaders,'Torque')
ans =

  Columns 1 through 9

     [] [] [] ...
  Columns 442 through 450

     [] [] [] [] []
etc.


Hope this helps explain what is going on. Thanks so much
for all your time.

Subject: determine col num from column headers

From: Andres

Date: 13 Aug, 2008 07:58:02

Message: 4 of 5

So your colheaders variable is a cell array of strings.
In that case I'd expect strfind to return a non-empty
element somewhere in the resulting cell array.
In any case, use find(strcmp( rather than strfind( on the
cell array, since strcmp tests for an exact match while
strfind also matches substrings. You may also need to deal
with extra whitespace around your column names.
The following example works:

    colheaders = {' Freq',' Torque ','Temp '};
    cleanColheaders = strtrim(colheaders);
    idx = find(strcmp('Torque',cleanColheaders));
    % idx = 2

Check if you need to use strcmpi instead of strcmp.
Does this work for you as well? If not, can you find out
why?
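
To extend the same idea to several column names and several files, one option is ismember, which returns the position of each wanted name in the trimmed header list. The following is a minimal sketch under stated assumptions: the `files` struct array, its `colheaders` and `data` fields, and the sample values are hypothetical stand-ins for whatever the import step actually produces.

```matlab
% Hypothetical stand-ins for two imported files; the field names and the
% values are assumptions, not the real data from the thread.
files(1).colheaders = {' Freq', ' Torque ', 'Temp '};
files(1).data       = [10 1.5 20; 11 1.7 21];
files(2).colheaders = {'Torque', 'Freq', 'Temp'};   % different column order
files(2).data       = [2.5 12 22; 2.7 13 23];

wanted = {'Torque', 'Temp'};          % exact column names to look up
torque = cell(1, numel(files));       % one torque vector per file

for f = 1:numel(files)
    % idx(i) is the column number of wanted{i}, or 0 if it was not found.
    [found, idx] = ismember(wanted, strtrim(files(f).colheaders));
    if ~all(found)
        error('File %d is missing column "%s".', f, wanted{find(~found, 1)});
    end
    torque{f} = files(f).data(:, idx(1));   % idx(1) belongs to 'Torque'
end
```

Because the lookup is repeated per file, it does not matter whether 'Torque' sits in the same column in every file. Like strcmp, ismember is case-sensitive, so the remark about strcmpi applies here too.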

Subject: determine col num from column headers

From: Keli

Date: 14 Aug, 2008 13:32:10

Message: 5 of 5

Fantastic! Thanks very much!

Kind regards,
Keli