Path: news.mathworks.com!newsfeed-00.mathworks.com!nlpi057.nbdc.sbc.com!prodigy.net!news.glorb.com!postnews.google.com!h5g2000yqh.googlegroups.com!not-for-mail
From: Sergioval <SerTeckian@gmail.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Matching data based on columns
Date: Sat, 14 Mar 2009 22:28:38 -0700 (PDT)
Organization: http://groups.google.com
Lines: 84
Message-ID: <14ae1837-01b5-4bde-939c-b5de4b7b2247@h5g2000yqh.googlegroups.com>
References: <gphqm6$eh5$1@fred.mathworks.com>
NNTP-Posting-Host: 71.182.244.104
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1237094918 21264 127.0.0.1 (15 Mar 2009 05:28:38 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Sun, 15 Mar 2009 05:28:38 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: h5g2000yqh.googlegroups.com; posting-host=71.182.244.104; 
	posting-account=AYu8sgkAAABb0aN95LV0lBEOYoipCXEs
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (Windows; U; Windows NT 6.0; es-ES; rv:1.9.0.7) 
	Gecko/2009021910 Firefox/3.0.7 (.NET CLR 3.5.30729),gzip(gfe),gzip(gfe)
Xref: news.mathworks.com comp.soft-sys.matlab:524952

Hi, I am not sure to understand what you mean with "the conditions of
the first four columns".

In the case you want to emulate the function in Excel VLOOKUP
(1,a2:c10,2), you could do something like:
  A(2, find(A(1,:)<=3DA(1,1),1) )

So for matching the first three columns you just have to remove the
constraint in the function find to work out only one result.
  [c,d]=3Dfind(A(:,1:3) - A(ones(size(A,1),1),1:3) =3D=3D 0);

Pay attention to how I am substracting the matrices. As they have to
be the same size I am replicating the first row with function "ones".
If you prefer to search approximately instead of exactly, just
substitute the =3D=3D for a choosen tolerance or set the right logical
condition.

Then get only the useful (not repeated) rows which are:
  c=3Dc(1:find(d~=3Dd(1),1)-1);

Finally, calculate the mean:
  B=3Dzeros(size(A,1)-length(c)+1,size(A,2));
  B(1,:) =3D mean(A(c,:));

And add the rest of the matrix:
  A(c,:)=3D[];
  B(2:1+size(A,1),:)=3DA;


Summaryzing, if I understood your goal, this code should help:

[c,d]=3Dfind(A(:,1:3) - A(ones(size(A,1),1),1:3) =3D=3D 0);
c=3Dc(1:find(d~=3Dd(1),1)-1);
B=3Dzeros(size(A,1)-length(c)+1,size(A,2));
B(1,:) =3D mean(A(c,:));
A(c,:)=3D[];
B(2:1+size(A,1),:)=3DA;


Cheers!
Sergio



On 14 mar, 22:55, "R L" <rcw...@hotmail.com> wrote:
> Hi all,
>
> I apologize in advance if this is a novice and trivial question but I wou=
ld appreciate all the help I can get! I must have spent at least three days=
 on trying to solve this problem and poured many hours on googling this sol=
ution but still to no avail.
>
> I want to sort data in the following format:
>
> A =3D [1 1 3 4 98 97 96; ...
> =A0 =A0 =A0 =A01 1 3 4 98 93 92; ...
> =A0 =A0 =A0 =A01 1 3 4 92 94 90; ...
> =A0 =A0 =A0 =A02 3 4 5 79 88 90; ...
> =A0 =A0 =A0 =A02 4 5 7 89 78 99; ...
> =A0 =A0 =A0 =A02 6 9 7 89 78 99];
>
> I want to sort the data based on the conditions of the first four columns=
 of the matrix A. Moreover, for the rows that match the criteria, is it pos=
sible to get an average? Perhaps it is easier to explain with an example. T=
hat is, the resulting matrix should look like the following:
>
> B =3D [1 1 3 4 96 94.7 92.7; ...
> =A0 =A0 =A0 =A02 3 4 5 79 88 90; ...
> =A0 =A0 =A0 =A02 4 5 7 89 78 99; ...
> =A0 =A0 =A0 =A02 6 9 7 89 78 99];
>
> where
> 96 =3D 1/3 * (98 + 98 + 92)
> 94.7 =3D 1/3 * (97 + 93 + 94)
> 92.7 =3D 1/3 * (96 + 92 + 90) =A0
> and whether the values in the last three columns match or not is irreleva=
nt (i.e. as illustrated by the last two rows of matrix A). =A0
>
> Thank you so much!
>
> PS. In effect, this is sort of like Excel's vlookup function (not quite, =
since I want three elements to match whereas Excel only allows for one and =
also that I want a calculation based on the matches).