Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Matching data based on columns
Date: Sun, 15 Mar 2009 11:31:01 +0000 (UTC)
Organization: The MathWorks, Inc.
Lines: 53
Message-ID: <gpiotl$a3u$1@fred.mathworks.com>
References: <gphqm6$eh5$1@fred.mathworks.com> <gpif2l$rg2$1@fred.mathworks.com>
Reply-To: <HIDDEN>
NNTP-Posting-Host: webapp-05-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1237116661 10366 172.30.248.35 (15 Mar 2009 11:31:01 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Sun, 15 Mar 2009 11:31:01 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1756295
Xref: news.mathworks.com comp.soft-sys.matlab:524978

"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <gpif2l$rg2$1@fred.mathworks.com>...
> "R L" <rcwlhk@hotmail.com> wrote in message <gphqm6$eh5$1@fred.mathworks.com>...
> > Hi all, 
> > 
> > I apologize in advance if this is a novice and trivial question but I would appreciate all the help I can get! I must have spent at least three days on trying to solve this problem and poured many hours on googling this solution but still to no avail. 
> > 
> > I want to sort data in the following format: 
> > 
> > A = [1 1 3 4 98 97 96; ...
> >        1 1 3 4 98 93 92; ... 
> >        1 1 3 4 92 94 90; ... 
> >        2 3 4 5 79 88 90; ... 
> >        2 4 5 7 89 78 99; ...
> >        2 6 9 7 89 78 99]; 
> > 
> > I want to sort the data based on the conditions of the first four columns of the matrix A. Moreover, for the rows that match the criteria, is it possible to get an average? Perhaps it is easier to explain with an example. That is, the resulting matrix should look like the following: 
> > 
> > B = [1 1 3 4 96 94.7 92.7; ...
> >        2 3 4 5 79 88 90; ... 
> >        2 4 5 7 89 78 99; ...
> >        2 6 9 7 89 78 99]; 
> > 
> > where
> > 96 = 1/3 * (98 + 98 + 92) 
> > 94.7 = 1/3 * (97 + 93 + 94) 
> > 92.7 = 1/3 * (96 + 92 + 90)  
> > and whether the values in the last three columns match or not is irrelevant (i.e. as illustrated by the last two rows of matrix A).   
> > 
> > Thank you so much! 
> > 
> > PS. In effect, this is sort of like Excel's vlookup function (not quite, since I want three elements to match whereas Excel only allows for one and also that I want a calculation based on the matches). 
> 
>   The following should do the job for you, R L, if I have understood you correctly.  I am not completely satisfied with it because it does a sort twice, once with 'sortrows' and the second time with 'unique', but so far I haven't figured a way around this.  We assume the N by p+q array A is given:
> 
>  p = 4; % no. cols. to check;
>  [N,q] = size(A);
>  q = q-p; % no. cols to do averaging on
>  T = sortrows(A,1:p); % Sort according to first p columns
>  [B,m,n] = unique(T(:,1:p),'rows'); % Make 1st p cols. unique
>  r = repmat(n,q,1); % Row nos. for accumarray
>  c = reshape(repmat(1:q,N,1),[],1); % Col. nos. for accumarray
>  T = reshape(T(:,p+1:p+q),[],1); % To be added by accumarray
>  T = accumarray([r,c],T)); % Do the required addition
>  d = repmat(diff(find([true;diff(n)~=0;true])),1,q); % Get divisor
>  B = [B,T./d]; % Append mean values in last q columns
> 
> B is the desired result.
> 
> Roger Stafford

Thank you to both of you! This is exactly the result that I need! Very efficient and simple! Great help to my project! 

R L